Most existing file systems were never designed to deal with Big Data. The result is a lot of fragmentation, which makes IT that much more difficult to manage.
To address that issue, Intel today is releasing an enterprise edition of Lustre, an open source distributed parallel file system that masks the complexities associated with accessing files in parallel.
According to Brent Gorda, general manager of the High Performance Data Division for Intel, Lustre supersedes the file system that comes with an operating system, because that native file system was never designed to handle massive files.
Gorda says Lustre takes advantage of parallelization to support tens of thousands of client systems and tens of petabytes of storage at speeds of well over 1 terabyte per second. He adds that, as a POSIX-compliant file system, Lustre also automatically updates all files in parallel.
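In practice, POSIX compliance means applications reach the file system through ordinary file calls. As a rough illustration, and not anything taken from Intel or the Lustre code base, the Python sketch below reads one file as independent chunks from several threads using only standard POSIX-style calls. The function names, chunk size, and worker count are assumptions made up for this example. The code runs on any POSIX file system and shows the access pattern only, not Lustre's performance.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def read_chunk(path, offset, length):
    """Read up to `length` bytes starting at `offset` with plain POSIX calls."""
    fd = os.open(path, os.O_RDONLY)
    try:
        parts = []
        # pread may return fewer bytes than requested, so loop until done or EOF.
        while length > 0:
            data = os.pread(fd, length, offset)
            if not data:
                break
            parts.append(data)
            offset += len(data)
            length -= len(data)
        return b"".join(parts)
    finally:
        os.close(fd)


def parallel_read(path, chunk_size=1 << 20, workers=4):
    """Read a file as independent chunks in parallel and reassemble them in order.

    `chunk_size` and `workers` are illustrative defaults, not tuned values.
    """
    size = os.path.getsize(path)
    offsets = range(0, size, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the offsets.
        chunks = pool.map(lambda off: read_chunk(path, off, chunk_size), offsets)
        return b"".join(chunks)
```

Pointing `parallel_read` at a file on a Lustre mount should require no code changes, which is the practical meaning of POSIX compliance. How much the parallel reads help would depend on how the file is laid out across the storage servers.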
In addition, Gorda notes that Lustre is compatible with multiple public cloud computing platforms, including Amazon Web Services, and comes with an adapter for Big Data environments based on the Intel Distribution for Apache Hadoop.
In all the enthusiasm for Big Data, there is a tendency to overlook some of the more mundane storage management issues that can make these projects pretty complex. Of course, this is an issue that any High Performance Computing (HPC) environment has long since dealt with. Now it’s just time for everybody working with Big Data in the enterprise to come to terms with the same issue.