Part I: What Are Metadata?

11 May 2016

Dave Piscitello

The concept of metadata is both simple and complicated. We readily understand what data are: they are the information that we communicate, process or consume in our ever-growing digitized society. But what are metadata?

Metadata: Data About Data

Data, especially digital data, take many forms. Voice conversations, text messaging or social media communicate data. Digital banking or merchant transactions involve the transfer of data. Web content, digitized and streamed entertainment, databases or information repositories of all kinds are examples of publications of data.

Metadata describe what these data are: they provide information about these data. That's pretty simple. However, if we dig a bit deeper, we find that "describing" data is both a rigorous technical exercise and a socio-politically charged issue. In this Part I, I'll explain what metadata are in a technical, quasi-scholarly manner.

What Kinds of Data Are Metadata?

Metadata provide a means to classify, organize, and characterize data or content. The National Information Standards Organization (NISO) provides a taxonomy that can be applied to all kinds of data or to data repositories, from libraries to web sites, for textual and non-textual data, in digitized or material forms.

NISO describes three types of metadata.

Descriptive metadata include information such as points of contact, the title or author of a publication, an abstract of a work, keywords used in a work, a geographic location, or even an explanation of methodology. These data are useful to discover, collect or group resources according to characteristics the resources share. To appreciate how descriptive metadata relate to informational data, visit the Business and Consumer Surveys pages hosted by the European Commission's Directorate-General for Economic and Financial Affairs. In addition to the survey data, you can also obtain the BCS Metadata for each EU member country's survey, for example, France. The metadata files identify the contact data, methodology, and date for each survey, but they do not contain the question or response data collected during the survey itself.
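
To make the distinction concrete, here is a minimal Python sketch. The field names, titles, and contact addresses are hypothetical and are not taken from the actual BCS files. It only illustrates that descriptive metadata can be used to discover and group resources without reading the data they describe.

```python
from collections import defaultdict

# Hypothetical resources: each pairs descriptive metadata with the data itself.
surveys = [
    {
        "metadata": {
            "title": "Consumer survey (France)",
            "contact": "contact-fr@example.org",
            "keywords": ["consumer", "confidence"],
        },
        "data": [{"question": "Q1", "response": 3}],
    },
    {
        "metadata": {
            "title": "Consumer survey (Germany)",
            "contact": "contact-de@example.org",
            "keywords": ["consumer"],
        },
        "data": [{"question": "Q1", "response": 4}],
    },
    {
        "metadata": {
            "title": "Industry survey (France)",
            "contact": "contact-fr@example.org",
            "keywords": ["industry", "confidence"],
        },
        "data": [{"question": "Q7", "response": 2}],
    },
]


def group_by_keyword(resources):
    """Group resource titles by keyword, reading only the descriptive metadata."""
    groups = defaultdict(list)
    for resource in resources:
        meta = resource["metadata"]  # the "data" entry is never touched
        for keyword in meta["keywords"]:
            groups[keyword].append(meta["title"])
    return dict(groups)


index = group_by_keyword(surveys)
print(index["confidence"])
```

Note that the grouping function never looks at the question or response data, which mirrors how the metadata files are separate from the survey results.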

Structural metadata explain how a resource is composed or organized. A digitized book, for example, can be published as individual page images, PDF or HTML files. These pages or component parts might typically be grouped into chapters. The chapter data, table of contents, or page layout details are considered structural metadata. A structural map of the pages or other resources of a web site, security intrusion event record types or voice call detail records are also kinds of structural metadata.
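
The digitized book example can be sketched in a few lines of Python. The class names and file names below are hypothetical; the point is only that a structural map (chapters grouping page files) is enough to derive a table of contents without opening any page image.

```python
from dataclasses import dataclass, field


@dataclass
class Chapter:
    """Structural metadata for one chapter: its title and its page files, in order."""
    title: str
    page_files: list = field(default_factory=list)


@dataclass
class BookStructure:
    """Structural map of a digitized book (hypothetical layout)."""
    chapters: list = field(default_factory=list)

    def table_of_contents(self):
        """Return (chapter title, first page number) pairs, counting pages from 1."""
        toc = []
        next_page = 1
        for chapter in self.chapters:
            toc.append((chapter.title, next_page))
            next_page += len(chapter.page_files)
        return toc


book = BookStructure(chapters=[
    Chapter("Introduction", ["p001.png", "p002.png"]),
    Chapter("Methods", ["p003.png", "p004.png", "p005.png"]),
    Chapter("Results", ["p006.png"]),
])

toc = book.table_of_contents()
print(toc)
```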

Administrative metadata are used to manage a resource. Creation or acquisition dates, access permissions, rights or provenance, and guidelines for disposition such as retention or removal are examples of administrative metadata that a digital archivist or curator might employ. Similar metadata would be relevant for a database administrator, or for administrators responsible for capturing telecommunications or data network traffic flows or security log and event data.
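
As a rough illustration, the following Python sketch shows how an administrator might use such metadata to answer two management questions: who may read a resource, and whether it is due for removal. The class, field names, and the 90-day retention period are assumptions made for the example, not part of any standard.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class AdminMetadata:
    """Hypothetical administrative metadata for one archived resource."""
    created: date               # creation or acquisition date
    retention_days: int         # disposition guideline: how long to keep it
    allowed_readers: frozenset  # access permissions

    def may_read(self, user):
        """True if the user is listed in the access permissions."""
        return user in self.allowed_readers

    def due_for_removal(self, today):
        """True once the retention period has fully elapsed."""
        return today > self.created + timedelta(days=self.retention_days)


# Example: a security log captured on 1 January 2016 and kept for 90 days.
log_meta = AdminMetadata(
    created=date(2016, 1, 1),
    retention_days=90,
    allowed_readers=frozenset({"auditor"}),
)

print(log_meta.may_read("auditor"))
print(log_meta.due_for_removal(date(2016, 5, 11)))
```

Both decisions are made from the metadata alone; the log contents themselves are never consulted.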

We've Only Scratched The Surface

Now that you've seen several kinds of metadata, you can appreciate how useful metadata can be for any party, organization, or government agency that collects, aggregates, manages, or retains metadata on a large scale. You may also appreciate how activities that involve metadata collection on a large scale can be sources of controversy. We'll cover these in the next post in this series.
