Abstract
To accommodate the explosive growth of data in areas such as scientific computing and e-business, physical storage devices and control components have been separated from traditional computing systems to form a scalable, intelligent storage subsystem. When appropriately designed, such a subsystem should provide a transparent storage interface, effective data allocation, and flexible, efficient storage management, among other features. Its design goals and desirable features include high performance, high scalability, high availability, high reliability, and high security. Extensive research has been conducted in this field worldwide, yet many issues remain open and challenging. This paper studies five online massive storage systems and one offline storage system that we have developed with research grant support from China. The storage pool with multiple network-attached RAIDs avoids expensive store-and-forward data copying between the server and the storage system, improving the data transfer rate by a factor of 2 to 3 over a traditional disk array. Two types of high-performance distributed storage systems for local-area network storage are introduced. One is the Virtual Interface Storage Architecture (VISA), in which VI replaces TCP/IP as the communication protocol. By designing and implementing the vSCSI (VI-attached SCSI) protocol to support SCSI commands over the VI network, VISA is shown to outperform IP SAN. The other is a fault-tolerant parallel virtual file system designed and implemented to provide high I/O performance and high reliability. A global distributed storage system for wide-area network storage is discussed in detail, in which a Storage Service Provider is added to provide storage service and to act as the user agent for the storage system.
Object-based storage systems not only store data but also adopt the attributes and methods of the objects that encapsulate the data. The adaptive policy triggering mechanism (APTM), which borrows proven machine learning techniques to improve the scalability of object storage systems, embodies the idea of a smart storage device and facilitates the self-management of massive storage systems. A typical offline massive storage system is used to back up data or store documents, and tape virtualization technology is discussed for such systems. Finally, a domain-based storage management framework for different types of storage systems is presented.