n. (abbr. WARC format) a file format for web archives that concatenates multiple web resources and their metadata into a single file

Galloway 2011, 173
Currently the International Internet Preservation Consortium is supporting the development of the Web Archive (WARC) format, based on the Internet Archive’s crawler output format, for conversion of websites. Two current projects, the Living Web Archives and the World Wide Web of Humanities, are being carried out in Europe to develop a capture format that retains the visual and interactive features of websites for digital discovery and archiving.

Lin et al. 2017
Although WARC has emerged as the standard archival format, a substantial amount of legacy ARC data still exist, meaning that we must be able to load them. An additional complication is that WARCs and ARCs are often intermixed in archival file systems, meaning that one call must be able to ingest them interchangeably.

LoC 2022b
The WARC (Web ARChive) format specifies a method for combining multiple digital resources into an aggregate archival file together with related information. The WARC format is a revision of the Internet Archive’s ARC File Format [ARC_IA] format that has traditionally been used to store “web crawls” as sequences of content blocks harvested from the World Wide Web. The WARC format generalizes the older format to better support the harvesting, access, and exchange needs of archiving organizations. Besides the primary content currently recorded, the revision accommodates related secondary content, such as assigned metadata, abbreviated duplicate detection events, later-date transformations, and segmentation of large resources. ¶ A WARC format file is the concatenation of one or more WARC records. A WARC record consists of a record header followed by a record content block and two newlines; the header has mandatory named fields that document the date, type, and length of the record and support the convenient retrieval of each harvested resource (file). There are eight types of WARC record: ‘warcinfo’, ‘response’, ‘resource’, ‘request’, ‘metadata’, ‘revisit’, ‘conversion’, and ‘continuation’. The content blocks in a WARC file may contain resources in any format; examples include the binary image or audiovisual files that may be embedded or linked to in HTML pages.

Internet Archive 2023a
Web pages crawled by the Internet Archive are stored as WARC. This is a file format for concatenating several resources, each consisting of a set of simple text headers and an arbitrary data block, into one long file.
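
Illustration (not part of the cited sources)
The record layout described in the citations above (a header of named fields, a content block whose size is given by a length field, and two trailing newlines, with records concatenated into one file) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a conforming implementation: the helper names `build_record` and `read_records` are hypothetical, the sample dates and record IDs are made up, and the sketch assumes uncompressed input with CRLF line endings and no folded header lines. Production WARC files are commonly gzip-compressed and are better handled by a dedicated library.

```python
import io


def build_record(warc_type, date, record_id, block):
    """Assemble one WARC record (hypothetical helper): header, blank line, block, two CRLFs."""
    header_lines = [
        "WARC/1.0",
        f"WARC-Type: {warc_type}",
        f"WARC-Date: {date}",
        f"WARC-Record-ID: {record_id}",
        f"Content-Length: {len(block)}",
    ]
    header = "\r\n".join(header_lines).encode("utf-8") + b"\r\n\r\n"
    return header + block + b"\r\n\r\n"


def read_records(stream):
    """Yield (fields, block) for each record in an uncompressed WARC byte stream."""
    while True:
        version = stream.readline()
        if not version:
            return  # end of file
        if not version.startswith(b"WARC/"):
            raise ValueError("expected a WARC version line")
        fields = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # a blank line ends the header
            name, _, value = line.decode("utf-8").partition(":")
            fields[name.strip()] = value.strip()
        # The length field says how many bytes of content block follow.
        block = stream.read(int(fields["Content-Length"]))
        stream.read(4)  # skip the two CRLFs that close the record (assumed CRLF)
        yield fields, block


# Made-up sample data: two records concatenated into one "file".
data = build_record(
    "warcinfo", "2023-01-01T00:00:00Z",
    "<urn:uuid:00000000-0000-0000-0000-000000000001>",
    b"software: example\r\n",
) + build_record(
    "resource", "2023-01-01T00:00:01Z",
    "<urn:uuid:00000000-0000-0000-0000-000000000002>",
    b"hello",
)

for fields, block in read_records(io.BytesIO(data)):
    print(fields["WARC-Type"], fields["Content-Length"], block)
```

Because each record declares its own length, a reader can skip from one record to the next without interpreting the content block, which is what allows blocks to hold resources in any format.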