Abstract—Networks of computers are everywhere, and the internet is one of the most common examples. Likewise, a distributed system is a network that consists of autonomous computers connected through distributed middleware. In this paper, four distributed file system architectures, Google File System, Microsoft's distributed file network, Andrew File System, and Sun Network File System, are reviewed on the basis of performance, scalability, data integrity, security, and heterogeneity. A comparative study is required for a better understanding of the different file systems.
Keywords— DFS, GFS, SUN, AFS, Google File System, Sun Network File System, Andrew File System.
I. Introduction [1]

A file system, sometimes abbreviated FS and also referred to as file management, is a method and data structure that an operating system uses to keep track of the files on a disk or partition. The term also refers to a partition or disk that is used to store files, or to the type of the file system. A file is a collection of related information recorded on secondary storage or, put differently, a collection of logically related entities. A file system usually consists of files separated into groups called directories. Many types of file system are in common use, and the type determines how data is accessed.

A distributed file system (DFS) is a client/server-based application that allows clients to access and process data stored on the server as if it were on their own machine. When a user accesses a file on the server, the server sends the user a copy of the file, which is cached on the user's computer while the data is being processed and is then returned to the server. A distributed file system organizes the files and directory services of individual servers into a global directory in such a way that remote data access is not location-specific but is identical from any client. The files requested by a user may be located on different systems in different places around the world; whenever a user requests a service or file, these systems together provide the information or service to the client.
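The access pattern described above (look the file up in a global directory, fetch a copy, work on the cached copy, return it to the server) can be sketched in a few lines. This is only an illustrative sketch under simplifying assumptions: `FileServer`, `Client`, and their methods are hypothetical names, the servers are in-memory objects rather than networked machines, and the cache-consistency problems a real DFS must solve are ignored.

```python
class FileServer:
    """Hypothetical in-memory stand-in for one file server."""

    def __init__(self, name):
        self.name = name
        self.files = {}  # path -> contents

    def read(self, path):
        return self.files[path]

    def write(self, path, data):
        self.files[path] = data


class Client:
    """Hypothetical client that caches file copies while working on them."""

    def __init__(self, directory):
        # Global directory: path -> FileServer. The client never names a
        # server itself, so access is not location-specific.
        self.directory = directory
        self.cache = {}  # path -> locally cached copy

    def open(self, path):
        # Fetch a copy from whichever server holds the file, once.
        if path not in self.cache:
            self.cache[path] = self.directory[path].read(path)
        return self.cache[path]

    def update(self, path, data):
        # Changes touch only the cached copy until close() is called.
        self.open(path)
        self.cache[path] = data

    def close(self, path):
        # Return the (possibly modified) copy to the server and drop it.
        data = self.cache.pop(path)
        self.directory[path].write(path, data)


# Usage: two servers behind one global directory.
server_a = FileServer("server-a")
server_b = FileServer("server-b")
server_a.write("/docs/report.txt", "draft 1")
server_b.write("/data/log.txt", "empty")
directory = {"/docs/report.txt": server_a, "/data/log.txt": server_b}

client = Client(directory)
client.open("/docs/report.txt")
client.update("/docs/report.txt", "draft 2")
client.close("/docs/report.txt")
```

The client code is the same whichever server holds the file, which is the point of the global directory; moving a file would only require changing the directory entry.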
Sharing of resources is the main motive of a DFS. A DFS runs on multiple independent computers connected through a communication network, but appears to its users as a single virtual machine. Each computer node runs its own operating system and has its own memory. The Internet, intranets, and mobile and ubiquitous computing are common examples of distributed systems.
Fig__ shows the architecture of a distributed file system.

II. Literature Review

Aditya B. Patel, Manashvi Birla, and Ushma Nair, "Addressing Big Data Problem Using Hadoop and Map Reduce," Nirma University International Conference on Engineering (NUiCONE), 06-08 December 2012. [2]

Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung, "The Google File System," Google. [3]

Shiva Asadianfam, Mahboubeh Shamsi, and Shahrad Kashany, "A Review: Distributed File System," International Journal of Computer Networks and Communications Security, vol. 3, no. 5, May 2015, pp. 229–234. [4]

III. Distributed File System [5]

A distributed file system is a client/server-based application that allows clients to access and process data stored on the server as if it were on their local node. When a user accesses a file on the server, the server sends the user a copy of the file, which is cached on the user's computer while the data is being processed and is then returned to the server. Distributed file systems are the bedrock of distributed computing in office and engineering environments.

Fig-I Architecture of Distributed File System [6]

Features of Distributed File System [7]

• Transparency [8]
Transparency refers to hiding details from the user. There are four types of transparency:
i. Structure transparency: Multiple file servers are used to provide better performance, scalability, and reliability. The multiplicity of file servers should be transparent to the clients of a distributed file system.
ii. Access transparency: Local and remote files should be accessible in the same way. The file system should automatically locate an accessed file and transport it to the client's site.
iii. Naming transparency: The name of a file should not reveal its location, and the name must not change when the file moves from one node to another.
iv. Replication transparency: Where files are replicated on multiple nodes, the existence of the copies and their locations should be hidden from the clients.

• User Mobility
The user is not bound to a specific node but should have the flexibility to work on any given machine at different times.

• Performance
Performance is measured as the average amount of time needed to satisfy client requests, which includes CPU time, the time for accessing secondary storage, and network access time. Explicit file placement decisions should not be needed to increase the performance of a distributed file system.
• Data Integrity
Concurrent access requests from multiple users who are competing to access the same file must be properly synchronized using some form of concurrency control mechanism. A file system can also provide atomic transactions to users for data integrity.

IV. Characteristics of Distributed File System [9]

• Concurrency
Concurrency is the occurrence of two or more events at the same time. The issue is how to handle the sharing of resources between clients, since concurrently executing programs share resources such as web pages and files.

• No Global Clock
In a distributed system, computers are connected through a network and each has its own clock. Programs communicate and share only through messages, and their coordination depends on time, so the absence of a single shared clock has to be taken into account.

• Independent Failure
Each component of a distributed system can fail independently, leaving the other components unaffected.

• Fault Tolerance
Fault tolerance is the property of a system that continues operating properly in the event of failure.

• Scalability
Scalability is the capability of a system, network, or process to handle a growing amount of work, or its potential to be enlarged to accommodate that growth.

• Heterogeneity
Heterogeneous computing refers to systems that use more than one kind of processor or core. These systems gain performance or energy efficiency not just by adding processors of the same type but also by adding dissimilar co-processors.

• Security
Security is one of the most important principles. Because security needs to be pervasive throughout the system, security mechanisms are normally built into the distributed system itself.

V. Google File System [10]

Google File System (GFS) is a highly scalable distributed file system that runs on inexpensive commodity hardware, provides fault tolerance, and delivers high aggregate performance to many clients.
The design has been driven by observations of Google's application workloads and technological environment, both current and anticipated, that reflect a marked departure from some earlier file system assumptions. This led the designers to reexamine traditional choices and explore radically different design points. The file system has successfully met Google's storage needs as the storage platform for the generation and processing of data.
The largest cluster provides hundreds of terabytes of storage across thousands of disks on over a thousand machines, and it is concurrently accessed by hundreds of clients. GFS is one of the most successful examples of a real-world application of a distributed system, with a very high degree of fault tolerance.
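In general terms, fault tolerance on commodity hardware rests on keeping more than one copy of the data, so that the failure of a single machine does not make the data unavailable. The sketch below illustrates only that general idea and is not Google's implementation: `Machine`, `ReplicatedStore`, `MachineDown`, and the default of three copies are hypothetical names and values chosen for illustration, and the machines are in-memory objects.

```python
class MachineDown(Exception):
    """Raised when a failed machine is asked to store or return data."""


class Machine:
    """Hypothetical commodity machine that can fail at any time."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.blocks = {}  # key -> data

    def put(self, key, data):
        if not self.alive:
            raise MachineDown(self.name)
        self.blocks[key] = data

    def get(self, key):
        if not self.alive:
            raise MachineDown(self.name)
        return self.blocks[key]


class ReplicatedStore:
    """Keeps several copies of each item on different machines."""

    def __init__(self, machines, replicas=3):
        self.machines = machines
        self.replicas = replicas  # assumed copy count, for illustration
        self.locations = {}       # key -> machines holding a copy

    def write(self, key, data):
        # Place copies on the first few machines that are currently up.
        targets = [m for m in self.machines if m.alive][: self.replicas]
        if not targets:
            raise MachineDown("no live machine available")
        for machine in targets:
            machine.put(key, data)
        self.locations[key] = targets

    def read(self, key):
        # Try each copy in turn and skip machines that have failed.
        for machine in self.locations[key]:
            try:
                return machine.get(key)
            except MachineDown:
                continue
        raise MachineDown(f"all copies of {key!r} are unavailable")


# Usage: the data stays readable after two of the three holders fail.
machines = [Machine(f"m{i}") for i in range(4)]
store = ReplicatedStore(machines)
store.write("block-1", b"payload")
machines[0].alive = False
machines[1].alive = False
data = store.read("block-1")
```

A read fails only when every machine holding a copy is down, which is why adding copies on independent machines raises availability at the cost of extra storage.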