File Management

File systems are an integral part of any modern operating system. A pattern has emerged whereby most proprietary operating systems are built around a single primary file system, while Linux (the poster child of open source) supports a number of file systems. The role of the file system as the operating system's way of storing and organizing data cannot be disputed. However, the increased capacity of storage media such as hard disks and flash-based media requires additional capabilities that perhaps build on the file system.

Anyone who keeps a significant and varied amount of data on a recently purchased computer would be interested in searching their locally stored data using the same approach that is used by internet search engines like Google. There are a number of desktop search products available, and Google offers Google Desktop, which indexes the data on a hard disk and allows subsequent searching. My experience with Google Desktop has been good, though upgrading between versions has caused some problems in the past. I have heard complaints about how much of a resource hog it is, but that is not something I have experienced myself; the initial indexing takes some time, which is probably why it consumes a lot of resources.

File Systems & Desktop Search

File systems need to provide extensive metadata about the files that they store and, perhaps more importantly, include additional services that analyze and group this metadata in a manner that makes sense to a user. For example, it should be possible and easy to group documents that concern a particular company or business proposal into a single unit, regardless of whether these files are Word documents, spreadsheets or PDF documents. This is important for the future usability of computers, since hard disk capacity has been increasing and soon enough we will have 1 terabyte hard disks, which represents a lot of information to keep in folders. Microsoft’s ambitious WinFS was rumoured to offer such capabilities, but that particular “pillar” of Windows Vista has been removed and some of its technologies have been integrated into other Microsoft projects.
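The grouping idea above can be sketched in a few lines of Python. This is only an illustration, not how WinFS or any real file system works: the file names and topic tags are made up, and a real service would extract them from file metadata rather than hard-code them.

```python
from collections import defaultdict

# Hypothetical records: (file name, set of topic tags). The tags stand in
# for metadata that a file system service would extract automatically.
FILES = [
    ("proposal.doc", {"acme", "proposal"}),
    ("budget.xls", {"acme", "finance"}),
    ("contract.pdf", {"acme"}),
    ("holiday.jpg", {"personal"}),
]


def group_by_tag(files):
    """Map each tag to the sorted list of file names that carry it."""
    groups = defaultdict(list)
    for name, tags in files:
        for tag in tags:
            groups[tag].append(name)
    return {tag: sorted(names) for tag, names in groups.items()}


groups = group_by_tag(FILES)
# All files about the (made-up) company "acme", whatever their format.
print(groups["acme"])
```

The point is that the Word document, the spreadsheet and the PDF end up in one group because of what they are about, not because of the folder they live in.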

File Systems

  • FAT: this is perhaps the most widely used and simplest file system of all. It was created by Microsoft and used in consumer versions of Windows up to Windows Me. Most PC OSes support FAT, which makes it the common denominator for tasks such as sharing data across disparate operating system platforms, as well as for use on removable media such as floppy disks or flash disks. There are three versions of FAT (FAT12, FAT16 and FAT32), which differ in the sizes of disks and files they support; long file names are also supported. The file system uses File Allocation Tables to keep track of which areas of the disk have data stored in them, which areas are free and which are potentially unusable for data storage. A FAT file system tends to fragment as it scatters data across the disk; this reduces the performance of the file system and makes defragmentation (on a regular basis) necessary. FAT32 supports a theoretical maximum volume size of 8 TB (terabytes).
  • NTFS (New Technology File System): this is the file system that is used in newer versions of Microsoft’s Windows operating system; all versions of Windows based on the NT kernel (Windows NT, Windows 2000, Windows XP, Windows Server 2003 and Windows Vista) use NTFS as the primary file system, though this may not be the case with the Home or basic editions of some of these releases (where applicable). NTFS includes features such as support for metadata (file attributes) and the use of advanced data structures to improve performance, reliability and disk utilization. NTFS also includes support for Access Control Lists and file system journaling, which enables the OS to recover from potentially damaging events, like power blackouts, that threaten the integrity of the file system.
  • Ext3 (the third extended file system): this is an open source file system commonly used by default in most Linux distributions. It was created by Stephen Tweedie and is an extension of ext2; it adds journaling and tree-based directory indices to ext2. Because of its close relationship with ext2, it is easy to upgrade from ext2 to ext3, and most of the ext2 tools will continue to work on ext3. However, this close link to ext2 is also a disadvantage, because ext3 lacks features that are available in most modern file systems, such as dynamic allocation of i-nodes. Ext3 lacks an online defragmentation tool …
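
To make the File Allocation Table idea from the FAT entry more concrete, here is a toy sketch in Python. It assumes a simplified, FAT16-style table held in a list, where the index is a cluster number and the value is the next cluster of the same file; the marker values and the ten-cluster table are illustrative, and a real FAT volume has a boot sector, directory entries and reserved clusters that are ignored here.

```python
FREE = 0x0000          # cluster holds no data
BAD = 0xFFF7           # cluster is marked unusable
END_OF_CHAIN = 0xFFFF  # last cluster of a file


def cluster_chain(fat, start):
    """Follow the table from `start` and return the file's clusters in order."""
    chain = []
    cluster = start
    while cluster != END_OF_CHAIN:
        if cluster in chain:
            raise ValueError("corrupt table: chain loops back on itself")
        chain.append(cluster)
        cluster = fat[cluster]
    return chain


def free_clusters(fat):
    """List the data clusters (numbered from 2, as in FAT) that are free."""
    return [i for i in range(2, len(fat)) if fat[i] == FREE]


# Toy table with ten entries. One file starts at cluster 2 and is scattered
# over clusters 2 -> 5 -> 3; cluster 4 is marked bad.
fat = [FREE] * 10
fat[2] = 5
fat[5] = 3
fat[3] = END_OF_CHAIN
fat[4] = BAD

print(cluster_chain(fat, 2))
print(free_clusters(fat))
```

The file in this example occupies clusters that are not next to each other on disk, which is exactly the fragmentation that makes regular defragmentation necessary.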

    Further Reading, more

Interesting Concepts (will be expounded as time passes)

Extent: this is a contiguous area of storage allocated for the storage of a file.
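
A rough way to picture an extent is as a (start block, length) pair. The sketch below is an assumption made for illustration and is not taken from any particular file system: it collapses an ordered list of block numbers into such pairs.

```python
def blocks_to_extents(blocks):
    """Collapse an ordered list of block numbers into (start, length) extents."""
    extents = []
    for block in blocks:
        if extents and extents[-1][0] + extents[-1][1] == block:
            # The block directly follows the last extent, so grow that extent.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            # There is a gap, so a new extent starts here.
            extents.append((block, 1))
    return extents


# Six blocks described by just two extents.
print(blocks_to_extents([10, 11, 12, 13, 40, 41]))
```

The fewer extents a file needs, the less bookkeeping the file system has to keep for it and the less fragmented the file is.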


B+tree Data Structure:
