Metadata (singular: metadatum) (sometimes called Metainformation) is "data about data", of any sort in any media. An item of metadata may describe an individual datum, or content item, or a collection of data including multiple content items.
The word meta comes from the Greek μετά, where it means 'after' or 'beyond'. In epistemology, the prefix meta- is used to mean about (its own category); thus metadata is 'data about the data'.
Metadata (sometimes written 'meta data') are used to facilitate the understanding, use, and management of data. The metadata required for effective data management varies with the type of data and context of use. In a library, where the data are the content of the titles stocked, metadata about a title would typically include a description of the content, the author, the publication date and the physical location.
In the context of a camera, where the data are the photographic image, metadata would typically include the date the photograph was taken and details of the camera settings (lens, focal length, aperture, shutter timing, white balance, etc.). On a portable music player such as an iPod, the album names, song titles and album art embedded in the music files are used to generate the artist and song listings, and are considered the metadata.
In the context of an information system, where the data are the content of the computer files, metadata about an individual data item would typically include the name of the field and its length. Metadata about a collection of data items, a computer file, might typically include the name of the file, the type of file and the name of the data administrator.
Metadata provides context for data.
If we consider a particular place in the real world, this may be described by data, for example:
To make sense of and use this data, context is important, and can be provided by metadata. The metadata for the above three items of data might include:
An item of metadata is itself data and therefore may have its own metadata. For example, "Post Code" might have the following metadata:
"27 June 2006" might have the following metadata:
The hierarchy of metadata descriptions can go on forever, but usually context or semantic understanding makes extensively detailed explanations unnecessary.
The role played by any particular datum depends on the context. For example, when considering the geography of London, "E83BJ" would be a datum and "Post Code" would be a metadatum. But, when considering the data management of an automated system that manages geographical data, "Post Code" might be a datum and then "data item name" and "5 characters, starting with A – Z" would be metadata.
In any particular context, metadata characterizes the data it describes, not the entity described by that data. So, in relation to "E83BJ", the datum "is in London" is a further description of the place in the real world which has the post code "E83BJ", not of the code itself. Therefore, although it is providing information connected to "E83BJ" (telling us that this is the post code of a place in London), this would not normally be considered metadata, as it is describing "E83BJ" qua place in the real world and not qua data.
The term was introduced intuitively, without a formal definition. Because of that, today there are various definitions. The most common one is the literal translation:
Example: "12345" is data, and with no additional context is meaningless. When "12345" is given a meaningful name (metadata) of "ZIP code", one can understand (at least in the United States, and further placing "ZIP code" within the context of a postal address) that "12345" refers to the General Electric plant in Schenectady, New York.
Because for most people the difference between data and information is merely a philosophical one of no relevance in practical use, other definitions are:
There are more sophisticated definitions, such as:
These are used more rarely because they tend to concentrate on one purpose of metadata — to find "objects", "entities" or "resources" — and ignore others, such as using metadata to optimize compression algorithms, or to perform additional computations using the data.
The metadata concept has been extended into the world of systems to include any "data about data": the names of tables, columns, programs, and the like. Different views of this "system metadata" are detailed below, but beyond that is the recognition that metadata can describe all aspects of systems: data, activities, people and organizations involved, locations of data and processes, access methods, limitations, timing and events, as well as motivation and rules.
Fundamentally, then, metadata is "the data that describe the structure and workings of an organization's use of information, and which describe the systems it uses to manage that information". To do a model of metadata is to do an "Enterprise model" of the information technology industry itself. [4]
In the context of the web and the work of the W3C in providing markup technologies of HTML, XML and SGML, the concept of metadata has a specific context that is perhaps clearer than in other information domains. With markup technologies there is metadata, markup and data content. The metadata describes characteristics about the data, while the markup identifies the specific type of data content and acts as a container for that document instance. This page in Wikipedia is itself an example of such usage, where the textual information is data; how it is packaged, linked, referenced, styled and displayed is markup; and aspects and characteristics of that markup are metadata set globally across Wikipedia.
In the context of markup the metadata is architected to allow optimization of document instances to contain only a minimum amount of metadata, while the metadata itself is likely referenced externally, such as in a schema definition (XSD) instance. Markup also provides specialised mechanisms that handle referential data, again avoiding confusion over what is metadata or data, and allowing optimizations. The reference and ID mechanisms in markup allow reference links between related data items, and links to data items that can then be repeated about a data item, such as an address or product details. These are then all themselves simply more data items and markup instances rather than metadata.
Similarly there are concepts such as classifications, ontologies and associations for which markup mechanisms are provided. A data item can then be linked to such categories via markup, providing a clean delineation between what is metadata and what are actual data instances. Therefore the concepts and descriptions in a classification would be metadata, but the actual classification entry for a data item is simply another data instance.
Some examples can illustrate the points here. Items in bold are data content, in italic are metadata, normal text items are all markup.
The two examples show in-line use of metadata within markup relating to a data instance (XML) compared to simple markup (HTML).
A simple HTML instance example:
<span style="normalText">Example</span>
And then an XML instance example with metadata:
<PersonMiddleName nillable="true">John</PersonMiddleName>
Here, the inline assertion that a person's middle name may be an empty data item is metadata about the data item. Such definitions, however, are usually not placed inline in XML. Instead these definitions are moved away into the schema definition that contains the metadata for the entire document instance. This illustrates another important aspect of metadata in the context of markup: the metadata is optimally defined only once for a collection of data instances. Hence repeated items of markup are rarely metadata, but rather more markup data instances themselves.
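The three-way split between markup, metadata and data content can be made concrete with a short sketch. The following Python fragment is only an illustration, not part of any standard: it parses the XML instance shown earlier with the standard library's xml.etree.ElementTree module and treats the element name as markup, the attributes as inline metadata and the element text as data content. The helper name split_instance is hypothetical.

```python
import xml.etree.ElementTree as ET


def split_instance(xml_text):
    """Separate one XML element into markup, inline metadata and data content."""
    element = ET.fromstring(xml_text)
    return {
        "markup": element.tag,             # the container (element name)
        "metadata": dict(element.attrib),  # inline assertions about the item
        "data": element.text,              # the data content itself
    }


parts = split_instance('<PersonMiddleName nillable="true">John</PersonMiddleName>')
print(parts)
```

In a schema-driven document the attribute would normally be absent from the instance, so the "metadata" entry would be empty and the same information would be looked up in the schema instead.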
When structured into a hierarchical arrangement, metadata is more properly called an ontology or schema. Both terms describe "what exists" for some purpose or to enable some action. For instance, the arrangement of subject headings in a library catalog serves not only as a guide to finding books on a particular subject in the stacks, but also as a guide to what subjects "exist" in the library's own ontology and how more specialized topics are related to or derived from the more general subject headings.
Metadata is frequently stored in a central location and used to help organizations standardize their data. This information is typically stored in a metadata registry.
Usually it is not possible to distinguish between (plain) data and metadata because:
These considerations apply no matter which of the above definitions is considered, except where explicit markup is used to denote what is data and what is metadata.
Metadata has many different applications; this section lists some of the most common.
Metadata is used to speed up and enrich searching for resources. In general, search queries using metadata can save users from performing more complex filter operations manually. It is now common for web browsers (with the notable exception of Mozilla Firefox), P2P applications and media management software to automatically download and locally cache metadata, to improve the speed at which files can be accessed and searched.
Metadata may also be associated with files manually. This is often the case with documents which are scanned into a document storage repository such as FileNet or Documentum. Once the documents have been converted into an electronic format, a user brings the image up in a viewer application, manually reads the document and keys values into an online application to be stored in a metadata repository.
Metadata provide additional information to users of the data they describe. This information may be descriptive ("These pictures were taken by children in the school's third grade class.") or algorithmic ("Checksum=139F").
Metadata helps to bridge the semantic gap. By telling a computer how data items are related and how these relations can be evaluated automatically, it becomes possible to process even more complex filter and search operations. For example, if a search engine understands that "Van Gogh" was a "Dutch painter", it can answer a search query on "Dutch painters" with a link to a web page about Vincent Van Gogh, although the exact words "Dutch painters" never occur on that page. This approach, called knowledge representation, is of special interest to the semantic web and artificial intelligence.
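The Van Gogh example can be sketched as a lookup over machine-readable statements. The Python fragment below is a minimal illustration under stated assumptions: the triples, the relation names "nationality" and "occupation", and the function find_subjects are all hypothetical, and real knowledge-representation systems are considerably richer.

```python
# Hypothetical mini knowledge base of (subject, relation, value) statements.
TRIPLES = {
    ("Vincent Van Gogh", "nationality", "Dutch"),
    ("Vincent Van Gogh", "occupation", "painter"),
    ("Claude Monet", "nationality", "French"),
    ("Claude Monet", "occupation", "painter"),
}


def find_subjects(**facts):
    """Return the subjects for which every relation=value fact is asserted."""
    subjects = {subject for subject, _, _ in TRIPLES}
    return sorted(
        subject
        for subject in subjects
        if all((subject, rel, val) in TRIPLES for rel, val in facts.items())
    )


# A query for "Dutch painters" is answered from the metadata alone,
# without the phrase having to occur in any page text.
print(find_subjects(nationality="Dutch", occupation="painter"))
```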
Certain metadata is designed to optimize lossy compression. For example, if a video has metadata that allows a computer to tell foreground from background, the latter can be compressed more aggressively to achieve a higher compression rate.
Some metadata is intended to enable variable content presentation. For example, if a picture has metadata that indicates the most important region (the one where there is a person), an image viewer on a small screen, such as a mobile phone's, can narrow the picture to that region and thus show the user the most interesting details. A similar kind of metadata is intended to allow blind people to access diagrams and pictures, by converting them for special output devices or reading their description using text-to-speech software.
Other descriptive metadata can be used to automate workflows. For example, if a "smart" software tool knows content and structure of data, it can convert it automatically and pass it to another "smart" tool as input. As a result, users save the many copy-and-paste operations required when analyzing data with "dumb" tools.
Metadata is becoming an increasingly important part of electronic discovery. [1] Application and file system metadata derived from electronic documents and files can be important evidence. Recent changes to the Federal Rules of Civil Procedure make metadata routinely discoverable as part of civil litigation. Parties to litigation are required to maintain and produce metadata as part of discovery, and spoliation of metadata can lead to sanctions.
Metadata has become important on the World Wide Web because of the need to find useful information from the mass of information available. Manually created metadata adds value because it ensures consistency. If a web page about a certain topic contains a word or phrase, then all web pages about that topic should contain that same word or phrase. Metadata also ensures variety, so that if a topic goes by two names each will be used. For example, an article about "sport utility vehicles" would also be tagged "4 wheel drives", "4WDs" and "four wheel drives", as this is how SUVs are known in some countries.
Examples of metadata for an audio CD include the MusicBrainz project and All Media Guide's All Music Guide. Similarly, MP3 files have metadata tags in a format called ID3.
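As an illustration of embedded audio metadata, the sketch below reads a tag in the older ID3v1 layout, assuming the commonly documented form: a 128-byte block at the end of the file that begins with the marker "TAG" and holds fixed-width title, artist, album, year and comment fields followed by a one-byte genre code. The function name parse_id3v1 and the sample bytes are hypothetical, and the newer ID3v2 format, which sits at the start of the file, is not handled.

```python
import struct

# Assumed ID3v1 layout: marker, title, artist, album, year, comment, genre.
ID3V1_FORMAT = "<3s30s30s30s4s30sB"
ID3V1_SIZE = struct.calcsize(ID3V1_FORMAT)  # 128 bytes


def parse_id3v1(file_bytes):
    """Return the ID3v1 fields found at the end of file_bytes, or None."""
    if len(file_bytes) < ID3V1_SIZE:
        return None
    fields = struct.unpack(ID3V1_FORMAT, file_bytes[-ID3V1_SIZE:])
    marker, title, artist, album, year, comment, genre = fields
    if marker != b"TAG":
        return None

    def text(raw):
        # Fields are padded with NUL bytes (or spaces) to their fixed width.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
        "comment": text(comment),
        "genre_code": genre,
    }


# Synthetic example: placeholder "audio" bytes followed by a hand-built tag.
sample_tag = struct.pack(
    ID3V1_FORMAT, b"TAG", b"Song Title", b"Some Artist", b"Some Album", b"1999", b"", 17
)
sample_file = b"\x00" * 64 + sample_tag
print(parse_id3v1(sample_file))
```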
Metadata can be classified by:
To successfully develop and use metadata, several important issues should be treated with care:
Microsoft Office files include metadata beyond their printable content, such as the original author's name, the creation date of the document, and the amount of time spent editing it. Unintentional disclosure can be awkward or even, in professional practices requiring confidentiality, raise malpractice concerns. Some of a Microsoft Office document's metadata can be seen by clicking File then Properties from the program's menu. Other metadata is not visible except through external analysis of a file, such as is done in forensics. The author of the Microsoft Word-based Melissa computer virus in 1999 was caught due to Word metadata that uniquely identified the computer used to create the original infected document.
Even in the early phases of planning and designing it is necessary to keep track of all metadata created. It is not economical to start attaching metadata only after the production process has been completed. For example, if metadata created by a digital camera at recording time is not stored immediately, it may have to be restored afterwards manually with great effort. Therefore, it is necessary for different groups of resource producers to cooperate using compatible methods and standards.
Metadata can be stored either internally, in the same file as the data, or externally, in a separate file. Metadata that is embedded with content is called embedded metadata. A data repository typically stores the metadata detached from the data. Both ways have advantages and disadvantages:
Moreover, there is the question of data format: storing metadata in a human-readable format such as XML can be useful because users can understand and edit it without specialized tools. On the other hand, these formats are not optimized for storage capacity; it may be useful to store metadata in a binary, non-human-readable format instead to speed up transfer and save memory.
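The trade-off between readable and compact formats can be seen by serializing the same metadata both ways. The sketch below is illustrative only: the record, its field names and the fixed binary layout of three unsigned 16-bit integers are assumptions chosen for the example, not a real metadata format.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical image metadata used only for this comparison.
record = {"width": 1920, "height": 1080, "iso": 200}


def to_xml(meta):
    """Serialize the metadata as human-readable XML (returns bytes)."""
    root = ET.Element("metadata")
    for name, value in meta.items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root)


def to_binary(meta):
    """Serialize the same fields as three little-endian 16-bit integers."""
    return struct.pack("<HHH", meta["width"], meta["height"], meta["iso"])


def from_binary(blob):
    """Decode the binary form; the field order must be known in advance."""
    width, height, iso = struct.unpack("<HHH", blob)
    return {"width": width, "height": height, "iso": iso}


xml_form = to_xml(record)
binary_form = to_binary(record)
print(len(xml_form), "bytes as XML,", len(binary_form), "bytes as binary")
```

The XML form carries its own field names and can be edited in any text editor, whereas the binary form is far smaller but is meaningless without separate knowledge of its layout.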
Although the majority of computer scientists see metadata as a chance for better interoperability, some critics argue:
Opponents of metadata sometimes use the term metacrap to refer to the unsolved problems of metadata in some scenarios. These people are also referred to as "Meta Haters."
In general, there are two distinct classes of metadata: structural or control metadata and guide metadata. [5] Structural metadata is used to describe the structure of computer systems such as tables, columns and indexes. Guide metadata is used to help humans find specific items and is usually expressed as a set of keywords in a natural language.
Metadata can be divided into 3 distinct categories:
Each relational database system has its own mechanisms for storing metadata. Examples of relational-database metadata include:
In database terminology, this set of metadata is referred to as the catalog. The SQL standard specifies a uniform means to access the catalog, called the INFORMATION_SCHEMA, but not all databases implement it, even if they implement other aspects of the SQL standard. For an example of database-specific metadata access methods, see Oracle metadata.
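SQLite is one example of a database with its own catalog mechanism rather than INFORMATION_SCHEMA. The following sketch, using Python's standard sqlite3 module, reads table names from the sqlite_master table and column definitions from PRAGMA table_info. The table place and its columns are hypothetical, and the helper describe_tables is written for this illustration only.

```python
import sqlite3


def describe_tables(conn):
    """Return {table_name: [(column_name, declared_type), ...]} from the catalog."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    catalog = {}
    for table in tables:
        # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        catalog[table] = [(column[1], column[2]) for column in columns]
    return catalog


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE place (post_code TEXT, visited DATE)")
catalog = describe_tables(conn)
print(catalog)
```

Code written against this catalog would not run unchanged on a database that exposes INFORMATION_SCHEMA instead, which is the portability problem the paragraph above describes.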
Data warehouse metadata systems are sometimes separated into two sections:
Kimball[6] lists the following types of metadata in a data warehouse (See also [2]):
Michael Brackett defines metadata (what he calls "Data resource data") as "any data about the organization's data resource". [7] Adrienne Tannenbaum defines metadata as "the detailed description of instance data. The format and characteristics of populated instance data: instances and values, dependent on the role of the metadata recipient". [8] These definitions are characteristic of the "data about data" definition.
Business Intelligence is the process of analyzing large amounts of corporate data, usually stored in large databases such as the Data Warehouse, tracking business performance, detecting patterns and trends, and helping enterprise business users make better decisions. Business Intelligence metadata describes how data is queried, filtered, analyzed, and displayed in Business Intelligence software tools, such as reporting tools, OLAP tools and data mining tools.
Examples:
Business Intelligence metadata can be used to understand how corporate financial reports reported to Wall Street are calculated, and how the revenue, expense and profit are aggregated from individual sales transactions stored in the data warehouse. A good understanding of Business Intelligence metadata is required to solve complex problems such as compliance with corporate governance standards, such as Sarbanes-Oxley (SOX) or Basel II.
In contrast, David Marco, another metadata theorist, defines metadata as "all physical data and knowledge from inside and outside an organization, including information about the physical data, technical and business processes, rules and constraints of the data, and structures of the data used by a corporation."[9] Others have included web services, systems and interfaces. In fact, the entire Zachman framework (see Enterprise Architecture) can be represented as metadata. [10]
Notice that such definitions expand metadata's scope considerably, to encompass most or all of the data required by the Management Information Systems capability. In this sense, the concept of metadata has significant overlaps with the ITIL concept of a Configuration Management Database (CMDB), and also with disciplines such as Enterprise Architecture and IT portfolio management.
This broader definition of metadata has precedent. Third generation corporate repository products (such as those eventually merged into the CA Advantage line) not only store information about data definitions (COBOL copybooks, DBMS schema), but also about the programs accessing those data structures, and the Job Control Language and batch job infrastructure dependencies as well. These products (some of which are still in production) can provide a very complete picture of a mainframe computing environment, supporting exactly the kinds of impact analysis required for ITIL-based processes such as Incident and Change Management. The ITIL Back Catalogue includes the Data Management volume which recognizes the role of these metadata products on the mainframe, posing the CMDB as the distributed computing equivalent. CMDB vendors however have generally not expanded their scope to include data definitions, and metadata solutions are also available in the distributed world. Determining the appropriate role and scope for each is thus a challenge for large IT organizations requiring the services of both.
Since metadata is pervasive, centralized attempts at tracking it need to focus on the most highly leveraged assets. Enterprise Assets may only constitute a small percentage of the entire IT portfolio.
Some practitioners have successfully managed IT metadata using the Dublin Core metamodel. [11]
First generation data dictionary/metadata repository tools would be those only supporting a specific DBMS, such as IDMS's IDD (integrated data dictionary), the IMS Data Dictionary, and ADABAS's Predict.
Second generation would be ASG's DATAMANAGER product which could support many different file and DBMS types.
Third generation repository products became briefly popular in the early 1990s along with the rise of widespread use of RDBMS engines such as IBM's DB2.
Fourth generation products link the repository with more Extract, Transform, Load (ETL) tools and can be connected with architectural modeling tools. Examples include Adaptive Metadata Manager from Adaptive, Rochade from ASG, InfoLibrarian Metadata Integration Framework and Troux Technologies' Metis Server product.
Nearly all file systems keep metadata about files out-of-band. Some systems keep metadata in directory entries; others in specialized structures like inodes or even in the name of a file. Metadata can range from simple timestamps, mode bits, and other special-purpose information used by the implementation itself, to icons and free-text comments, to arbitrary attribute-value pairs.
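The simpler kinds of file system metadata mentioned here, such as timestamps and mode bits, can be read without opening the file at all. The sketch below uses Python's standard os.stat call; the helper name file_metadata and the choice of fields are assumptions made for illustration, and the exact values reported depend on the operating system and file system.

```python
import os
import stat
import tempfile
from datetime import datetime, timezone


def file_metadata(path):
    """Collect a few metadata items the file system keeps about path."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        "mode": stat.filemode(info.st_mode),  # e.g. "-rw-r--r--"
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        "is_directory": stat.S_ISDIR(info.st_mode),
    }


# Create a small throwaway file and inspect its metadata, not its content.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "note.txt")
    with open(path, "w") as handle:
        handle.write("hello")
    sample = file_metadata(path)

print(sample)
```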
With more complex and open-ended metadata, it becomes useful to search for files based on the metadata contents. The Unix find utility was an early example, although it is inefficient when scanning hundreds of thousands of files on a modern computer system. Apple Computer's Mac OS X operating system supports cataloguing and searching for file metadata through a feature known as Spotlight, as of version 10.4. Microsoft developed similar functionality with the Instant Search system in Windows Vista; it is also present in SharePoint Server. Linux implements file metadata using extended file attributes.
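A find-style metadata search can be sketched in a few lines of Python. The example below is an assumption-laden illustration rather than a description of how find, Spotlight, or Instant Search is implemented: it walks a directory tree and filters on size and modification time, which shows why the approach is linear in the number of files and why indexing services exist. The name `find_by_metadata` and its parameters are hypothetical.

```python
import os
import tempfile
import time


def find_by_metadata(root, min_size=0, modified_within_s=None):
    """Yield paths under `root` whose metadata match the given criteria.

    Every file is visited and stat()-ed, so the cost grows with the
    size of the tree; no index is consulted.
    """
    now = time.time()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if info.st_size < min_size:
                continue
            if modified_within_s is not None and now - info.st_mtime > modified_within_s:
                continue
            yield path


if __name__ == "__main__":
    # Demonstration on a throwaway directory with made-up file names.
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "notes.txt"), "wb") as handle:
            handle.write(b"x" * 2048)
        for match in find_by_metadata(root, min_size=1024):
            print(match)
```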
Examples of image files containing metadata include Exchangeable image file format (EXIF) and Tagged Image File Format (TIFF).
Having metadata about images embedded in TIFF or EXIF files is one way of acquiring additional data about an image. Tagging pictures with subjects, related emotions, and other descriptive phrases helps Internet users find pictures easily rather than having to search through entire image collections. A prime example of an image tagging service is Flickr, where users upload images and then describe the contents. Other patrons of the site can then search for those tags. Flickr uses a folksonomy: a free-text keyword system in which the community defines the vocabulary through use rather than through a controlled vocabulary.
Users can also tag photos for organizational purposes using, for example, Adobe's Extensible Metadata Platform (XMP).
Digital photography increasingly makes use of technical metadata tags describing the conditions of exposure. Photographers shooting in camera raw file formats can use applications such as Adobe Bridge or Apple Computer's Aperture to work with camera metadata for post-processing.
Metadata is casually used to describe the controlling data used in software architectures that are more abstract or configurable. Most executable file formats include what may be termed "metadata" that specifies certain, usually configurable, behavioral runtime characteristics. However, it is difficult if not impossible to precisely distinguish program "metadata" from general aspects of stored-program computing architecture; if the machine reads it and acts upon it, it is a computational instruction, and the prefix "meta" has little significance.
In Java, the class file format contains metadata used by the Java compiler and the Java virtual machine to dynamically link classes and to support reflection. The J2SE 5.0 version of Java included a metadata facility to allow additional annotations that are used by development tools.
In MS-DOS, the COM file format does not include metadata, while the EXE file and Windows PE formats do. These metadata can include the company that published the program, the date the program was created, the version number and more.
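To make the idea of executable metadata concrete, the Python sketch below reads three header fields from a PE image: the target machine, the section count, and the link timestamp. It relies on the documented PE layout, in which the 32-bit value at offset 0x3C of the DOS header points to the "PE\0\0" signature and the COFF file header follows it. Publisher and version details live elsewhere in the file (in version resources) and are not covered here. The helper names and the synthetic header bytes are hypothetical, used so that the example runs without a real executable.

```python
import struct
from datetime import datetime, timezone


def read_pe_header(data):
    """Extract a few COFF header fields from the bytes of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("no PE signature")
    # COFF header starts right after the signature:
    # Machine (2 bytes), NumberOfSections (2 bytes), TimeDateStamp (4 bytes)
    machine, sections, timestamp = struct.unpack_from("<HHI", data, pe_offset + 4)
    return {
        "machine": machine,
        "sections": sections,
        "built": datetime.fromtimestamp(timestamp, tz=timezone.utc),
    }


def make_synthetic_header(timestamp, machine=0x014C, sections=1):
    """Build a minimal, non-executable header purely for demonstration."""
    dos = bytearray(64)
    dos[0:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 64)  # PE signature directly follows
    coff = struct.pack("<HHIIIHH", machine, sections, timestamp, 0, 0, 0, 0)
    return bytes(dos) + b"PE\x00\x00" + coff


if __name__ == "__main__":
    header = read_pe_header(make_synthetic_header(timestamp=86400))
    print(header["built"].isoformat(), hex(header["machine"]))
```

On a real file, the same function could be applied to the bytes read from disk; a COM file would be rejected at the first check because it has no header at all.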
In the Microsoft .NET executable format, extra metadata is included to allow reflection at runtime.
The Object Management Group (OMG) has defined a metadata format for representing entire existing applications for the purposes of software mining, software modernization and software assurance. This specification, called the OMG Knowledge Discovery Metamodel (KDM), is the OMG's foundation for "modeling in reverse". KDM is a common language-independent intermediate representation that provides an integrated view of an entire enterprise application, including its behavior (program flow), data, and structure. One of the applications of KDM is business rules mining.
The Knowledge Discovery Metamodel includes a fine-grained low-level representation (called "micro KDM"), suitable for performing static analysis of programs.
Most programs that create documents, including Microsoft SharePoint, Microsoft Word and other Microsoft Office products, save metadata with the document files. These metadata can contain the name of the person who created the file (obtained from the operating system), the name of the person who last edited the file, how many times the file has been printed, and even how many revisions have been made on the file. Other saved material, such as deleted text (saved in case of an undelete command), document comments and the like, is also commonly referred to as "metadata", and the inadvertent inclusion of this material in distributed files has sometimes led to undesirable disclosures.
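The kind of authorship metadata described above can be inspected programmatically. The Python sketch below assumes the Office Open XML packaging used by newer Word files, in which the document is a ZIP archive and core properties such as creator, last editor, and revision count are stored in `docProps/core.xml`; older binary formats store the same information differently. The helper names are hypothetical, and the package is synthetic: it contains only the properties part and is not a valid Word document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}

FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "revision": "cp:revision",
}


def read_core_properties(package_bytes):
    """Return selected core properties from an OOXML package held in memory."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))
    return {key: root.findtext(tag, namespaces=NS) for key, tag in FIELDS.items()}


def make_synthetic_package(creator, last_modified_by, revision):
    """Build a ZIP holding only a core-properties part (demonstration data)."""
    core = (
        '<cp:coreProperties xmlns:cp="{cp}" xmlns:dc="{dc}">'
        "<dc:creator>{creator}</dc:creator>"
        "<cp:lastModifiedBy>{editor}</cp:lastModifiedBy>"
        "<cp:revision>{revision}</cp:revision>"
        "</cp:coreProperties>"
    ).format(cp=NS["cp"], dc=NS["dc"], creator=creator,
             editor=last_modified_by, revision=revision)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("docProps/core.xml", core)
    return buffer.getvalue()


if __name__ == "__main__":
    print(read_core_properties(make_synthetic_package("Alice", "Bob", 7)))
```

A metadata removal tool would go one step further and rewrite or blank these fields before the file leaves the organization.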
Document metadata is particularly important in legal environments, where litigants can request this sensitive information, which can include many elements of private and potentially damaging data. Such data has been linked to multiple lawsuits that have drawn corporations into legal complications.
Many law firms today use "Metadata Management Software", also known as "Metadata Removal Tools". This software can be used to clean documents before they are sent outside the firm. This process, known as metadata management, protects law firms from potentially unsafe leaks of sensitive data through electronic discovery.
For a list of executable formats, see object file.
Metadata on models are called metamodels. In Model Driven Engineering, a model has to conform to a given metamodel. According to the MDA guide, a metamodel is itself a model, and each model conforms to a given metamodel. Meta-modeling allows strict and agile automatic processing of models and metamodels.
The Object Management Group (OMG) defines 4 layers of meta-modeling. Each level of modeling is defined and validated by the next layer.
Xegy uses metadata.
Since metadata are also data, it is possible to have metadata of metadata: "meta-metadata". Machine-generated meta-metadata, such as the inverted index created by a free-text search engine, is generally not considered metadata, though.
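The index mentioned above can be sketched briefly in Python. This is a minimal illustration of an inverted index over metadata records, not a description of any particular search engine: it maps each word found in a record's metadata fields to the set of records containing it. The function name and the sample records are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(metadata_records):
    """Map each lowercase word in any metadata value to the record ids using it."""
    index = defaultdict(set)
    for record_id, fields in metadata_records.items():
        for value in fields.values():
            for word in str(value).lower().split():
                index[word].add(record_id)
    return index


if __name__ == "__main__":
    # Made-up image records with free-text metadata fields.
    records = {
        "img1": {"title": "Red sunset", "tags": "beach sunset"},
        "img2": {"title": "City beach"},
    }
    index = build_inverted_index(records)
    print(sorted(index["beach"]))
```

The index is derived mechanically from the metadata and exists only to speed up lookups, which is why it is usually treated as an implementation detail rather than as metadata in its own right.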
There are three categories of metadata that are frequently used to describe objects in a digital library:[12]
Metadata that describe geographic objects (such as datasets, maps, features, or simply documents with a geospatial component) have a history going back to at least 1994 (see the MIT Library page on FGDC Metadata). This class of metadata is described more fully on the Geospatial metadata page.