DATA STORAGE INFORMATION FROM EDITH HOPE CHAVULA (MLIS0225).
Data Storage
Digital curation is the continuous process of managing, preserving, and adding value to
digital assets throughout their lifecycle (Harvey & Bastian, 2020). A
critical phase within this lifecycle is data storage, which involves
maintaining digital objects in a secure, stable, and retrievable environment
(Digital Curation Centre, 2021). According to Higgins (2018), proper storage
ensures data integrity, prevents unauthorised access, and mitigates the risk of
long-term bit rot. Organisations and individuals must align their storage
practices with established international standards, such as ISO 14721, to
guarantee that information remains accessible and authentic over time
(International Organization for Standardization, 2012).
Types of data storage, challenges and solutions
To balance performance and budget, modern organizations deploy block, object,
and file storage architectures (El-Haddad et al., 2022). High-performance block
storage, such as Amazon Elastic Block Store (EBS), Google Cloud Persistent Disk,
and physical Storage Area Networks (SANs), splits data into raw, fixed-size
blocks to serve ultra-low-latency databases; however, it suffers from high costs
and limited metadata, which enterprises address using automated data tiering
and virtualized storage pools (El-Haddad et al., 2022).
Conversely, highly scalable object storage, such as Amazon Simple Storage
Service (S3), Azure Blob Storage, and Google Cloud Storage, uses a flat
namespace and rich metadata tags to host massive unstructured data lakes
(GeeksforGeeks, 2026). While object storage has higher read/write latency and
does not support incremental, in-place file editing, organizations mitigate
these bottlenecks by adopting high-throughput all-flash arrays and edge-caching
delivery networks (GeeksforGeeks, 2026).
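To make the flat namespace and metadata tagging concrete, the following is a minimal Python sketch that models an object store as an in-memory dictionary. The `ToyObjectStore` class, its method names, and the sample keys are hypothetical illustrations; they are not the API of Amazon S3, Azure Blob Storage, or Google Cloud Storage.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: an opaque payload plus free-form metadata tags."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class ToyObjectStore:
    """Hypothetical in-memory model of an object store (illustration only)."""

    def __init__(self):
        # Flat namespace: a single dictionary, no nested directories.
        self._objects = {}

    def put(self, key, data, **metadata):
        # Objects are written whole. Changing one byte means writing the
        # entire object again, which is why in-place editing is unavailable.
        self._objects[key] = StoredObject(data, dict(metadata))

    def get(self, key):
        return self._objects[key].data

    def find_by_tag(self, tag, value):
        # Metadata tags allow lookups that do not depend on folder structure.
        return sorted(
            key for key, obj in self._objects.items()
            if obj.metadata.get(tag) == value
        )


store = ToyObjectStore()
# The "/" in a key is only a naming convention, not a real folder.
store.put("theses/2024/thesis-a.pdf", b"%PDF ...", department="MLIS", kind="thesis")
store.put("images/scan-001.tiff", b"II* ...", department="Archives", kind="scan")
store.put("theses/2023/thesis-b.pdf", b"%PDF ...", department="MLIS", kind="thesis")

print(store.find_by_tag("department", "MLIS"))
```

The search by tag works across what look like different "folders", which is the practical benefit of pairing a flat namespace with descriptive metadata.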
Finally, hierarchical file storage, seen in Network-Attached Storage (NAS)
devices, local hard drives, and shared network directories, offers intuitive
folder navigation for office collaboration (Red Hat, 2018). Because file systems
can experience severe performance degradation and scaling bottlenecks as data
volumes expand, modern enterprises deploy hybrid cloud file systems that
continuously synchronize local network hardware with elastic cloud repositories
(Red Hat, 2018).
Figure 1. Types of data storage
Storage Procedures and Standards
Data storage keeps data in a secure and managed environment. According to Oliver
and Harvey (2016), this involves data ingestion, where files are validated,
virus-scanned, and assigned unique identifiers. Organisations must utilise
redundant storage systems, commonly achieved through the 3-2-1 backup strategy.
As stated by Smith (2022), this procedure requires keeping three copies of data,
stored on two different types of media, with one copy located off-site. For
example, a university repository might store student records on a local server,
back them up on an institutional hard drive, and archive a third copy in a
secure cloud environment. Furthermore, regular integrity checks using
cryptographic checksums, such as Message Digest 5 (MD5) or Secure Hash Algorithm
256 (SHA-256), are mandatory to detect file corruption (Pennock, 2019).
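The redundancy rule described above (three copies, two media types, one copy off-site), often called the 3-2-1 rule, can be expressed as a simple check. This is a minimal sketch: the `BackupCopy` structure and the media labels assigned to the university example are assumptions made for illustration, not part of any cited standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """Describes where one copy of a dataset lives (hypothetical model)."""
    location: str     # e.g. "local server"
    media_type: str   # e.g. "server disk", "external hard drive", "cloud"
    off_site: bool    # True if stored away from the primary site


def satisfies_3_2_1(copies):
    """Return True if the copies meet the three-two-one redundancy rule."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media_type for c in copies}) >= 2
    has_off_site = any(c.off_site for c in copies)
    return enough_copies and enough_media and has_off_site


# The university example from the text, with assumed media labels.
student_records = [
    BackupCopy("local server", "server disk", off_site=False),
    BackupCopy("institutional hard drive", "external hard drive", off_site=False),
    BackupCopy("secure cloud archive", "cloud", off_site=True),
]

print(satisfies_3_2_1(student_records))      # True
print(satisfies_3_2_1(student_records[:2]))  # False: two copies, none off-site
```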
Figure 2. Secure Hash Algorithm 256 (SHA-256).
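As an illustration of a checksum-based integrity (fixity) check, the sketch below uses Python's standard `hashlib` module to record a SHA-256 digest at ingest and compare it later. The helper names `sha256_of_file` and `is_intact`, and the temporary file standing in for an archived object, are assumptions for this example rather than a prescribed procedure.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected_digest):
    """Fixity check: does the file still match the digest recorded at ingest?"""
    return sha256_of_file(path) == expected_digest


# Demonstration on a temporary file that stands in for an archived object.
with tempfile.TemporaryDirectory() as workdir:
    archived = Path(workdir) / "record.txt"
    archived.write_bytes(b"student record v1")

    recorded = sha256_of_file(archived)  # stored in the catalogue at ingest
    ok_before = is_intact(archived, recorded)

    archived.write_bytes(b"student record v2")  # simulate corruption
    ok_after = is_intact(archived, recorded)

print(ok_before, ok_after)  # True False
```

Reading in chunks keeps memory use constant for large files. In practice the recorded digest would be stored separately from the file it protects, so that both cannot be damaged together.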
Compliance and Legal Frameworks
Data storage procedures must follow strict legal frameworks, such as the
General Data Protection Regulation (GDPR). Under GDPR Article 5(1)(e),
organisations must implement a storage limitation policy, keeping personal data
in identifiable form only for as long as necessary (European Union, 2016).
Meeting these legal mandates requires robust security measures, including
AES-256 encryption for data at rest and pseudonymisation, which involves
separating identifying personal details from other stored data points (European
Data Protection Board, 2023).
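The two ideas above, pseudonymisation and storage limitation, can be sketched in a few lines of Python. This is a simplified illustration and not a compliance tool: the choice of identifying fields, the random token scheme, the `identity_vault` name, and the retention period are all assumptions made for the example.

```python
import secrets
from datetime import date, timedelta

# Field names treated as directly identifying (an assumption for this sketch).
IDENTIFYING_FIELDS = {"name", "email"}


def pseudonymise(record):
    """Split a record into (token, identity part, pseudonymised part)."""
    token = secrets.token_hex(16)  # random, carries no personal information
    identity = {k: v for k, v in record.items() if k in IDENTIFYING_FIELDS}
    remainder = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    remainder["subject_token"] = token
    return token, identity, remainder


def is_past_retention(collected_on, retention_days, today):
    """Storage limitation: True once the retention period has elapsed."""
    return today > collected_on + timedelta(days=retention_days)


record = {"name": "A. Student", "email": "a.student@example.org",
          "programme": "MLIS", "grade": 72}

token, identity, research_row = pseudonymise(record)

# The lookup table is kept separately, under stricter access control.
identity_vault = {token: identity}

print(research_row)  # contains the token but no name or email
print(is_past_retention(date(2020, 1, 1), 365, today=date(2024, 1, 1)))  # True
```

Because the token is random, the pseudonymised row cannot be linked back to a person without the separately stored lookup table.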
Personal and Organisational Data Storage
Storage practices diverge significantly between personal and organizational contexts.
According to Marshall (2016), personal file storage often relies on individual
habits, utilising commercial cloud drives or external solid-state drives (SSDs)
to preserve items such as personal tax documents. Conversely, organisational
file storage demands enterprise-level curation. According to Pinfield, Cox and
Smith (2014),
research institutions and corporations manage large-scale data repositories
using structured metadata schemas like Dublin Core. While an individual might
simply drag a personal folder into a cloud sync folder, an organisation must
enforce automated retention schedules, access-control lists and disaster
recovery protocols to protect corporate memory and proprietary assets.
Figure 3. SSD storage device.
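To illustrate the organisational controls mentioned above, the following minimal Python sketch pairs a Dublin Core description with a simple access-control list. The element names come from the Dublin Core element set; the sample values, the role names, and the `is_allowed` helper are hypothetical and chosen only for illustration.

```python
# A description using elements from the Dublin Core element set.
dublin_core_record = {
    "title": "Student Records 2023",
    "creator": "Registry Office",
    "date": "2023-12-31",
    "type": "Dataset",
    "format": "text/csv",
    "identifier": "records-2023-001",  # hypothetical local identifier
    "rights": "Restricted: staff only",
}

# Access-control list: which role may perform which action (assumed roles).
ACCESS_CONTROL_LIST = {
    "archivist": {"read", "write", "delete"},
    "researcher": {"read"},
}


def is_allowed(role, action):
    """Return True if the ACL grants the action to the role."""
    return action in ACCESS_CONTROL_LIST.get(role, set())


print(is_allowed("researcher", "read"))    # True
print(is_allowed("researcher", "delete"))  # False
print(is_allowed("visitor", "read"))       # False: unknown roles get no access
```

Denying access by default for unknown roles is a deliberate choice here: a missing entry should never grant permissions.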
In conclusion, data storage within digital curation transcends the mere saving of
files onto a disk. It requires a systematic approach dictated by international
standards, rigorous security procedures like checksum verification, and strict
compliance with legal frameworks such as the GDPR. By understanding the
distinct operational needs of both personal and organisational files, curators
can successfully safeguard digital heritage against technological obsolescence
and security breaches.
References
Digital Curation Centre. (2021). The DCC curation lifecycle model. Edinburgh University Press.
El-Haddad, M., El-Sawy, A., & El-Khamy, S. E. (2022). Fault tolerance in big data storage and processing systems. Journal of King Saud University - Computer and Information Sciences, 34(3), 850–865. doi.org.
European Union. (2016). Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data. Official Journal of the European Union, L119, 1-88.
European Data Protection Board. (2023). Guidelines 01/2023 on data protection by design and by default. EDPB.
GeeksforGeeks. (2026). Block, object, and file storage in system design. geeksforgeeks.org
Higgins, S. (2018). Digital curation: The development of a discipline within information science. Journal of Documentation, 74(6), 1318–1338. https://doi.org/10.1108/JD-02-2018-0024
International Organization for Standardization. (2012). Space data and information transfer systems — Open archival information system (OAIS) — Reference model (ISO Standard No. 14721:2012).
Marshall, C. C. (2018). Reading and writing the electronic book. Morgan & Claypool Publishers.
Oliver, G., & Harvey, D. (2016). Digital curation: A how-to-do-it manual (2nd ed.). ALA Neal-Schuman.
Pinfield, S., Cox, A. M., & Smith, J. (2014). Research data management and libraries: Relationships, activities, drivers, and influences. PLoS ONE, 9(12), e114734. https://doi.org/10.1371/journal.pone.0114734
Red Hat. (2018). File storage, block storage, or object storage? redhat.com
Smith, J. (2022). The foundational principles of data backup and storage archiving. Academic Press.
Spichtinger, D., & Siren, J. (2017). The development of research data management policies in Horizon 2020. In Research data management policies (pp. 11–24). De Gruyter. https://doi.org/10.1515/9783110365634-002