Text in the Computer Age

Table of Contents

  1. A Quick Introduction to Data Formats
  2. Text for Humans, Text for Computers

A Quick Introduction to Data Formats

Throughout this course, we will come across and create digital documents in various data formats. Most of them will be stored in plain-text formats, which can be read and edited in the text editor you were asked to install beforehand. Before we get started, it is useful to briefly demonstrate what happens when we store documents on our computer. To that end, we’ll conduct a quick experiment with file extensions. It should make clear how important it is to consider data formats at the outset of a research and/or digitisation project, to help safeguard the interoperability and long-term sustainability of the data we produce and share. To try out the experiment on your own, you can download the materials that were used by clicking on the button below.1

Workshop Materials: data-formats.zip (<200kb)
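
If you would like to try something similar without the workshop materials, the following Python sketch shows one possible version of such an experiment. It is not the workshop's own demonstration: the file names and the helper function bytes_survive_rename are made up for illustration. It writes a short text to a .txt file, renames that file to .csv, and checks whether the stored bytes changed.

```python
import tempfile
from pathlib import Path


def bytes_survive_rename(text: str) -> bool:
    """Write text to a .txt file, rename it to .csv, and compare the bytes.

    Hypothetical helper for illustration only; the file names are arbitrary.
    """
    with tempfile.TemporaryDirectory() as folder:
        original = Path(folder) / "note.txt"
        original.write_text(text, encoding="utf-8")
        before = original.read_bytes()

        # Changing the extension only changes the file's name.
        renamed = original.rename(original.with_suffix(".csv"))
        after = renamed.read_bytes()

        return before == after


print(bytes_survive_rename("title,year\nHamlet,1603\n"))  # prints True
```

The extension is a label that tells the operating system which program to open the file with. The data inside the file stay the same, which is why the choice of format matters more than the name a file happens to carry.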

Text for Humans, Text for Computers

Now that we know a bit more about how data are stored and accessed on the computer, we need to learn how the computer might help us analyse those data, or infuse them with meaning. To do so, we need to understand the difference between the way a computer reads and ‘understands’ texts, and the way we do.

Students taking this workshop as part of their larger course on digital text analysis at Uppsala University will already have had an introduction to how (digital) text is understood by humans. After a brief summary of the most important points of that introduction, this section will focus instead on the way digital texts are read by computers. This will help us understand how we might translate our human understanding of texts to the computer.2
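
As a small preview of what 'reading' means for a computer, here is a minimal Python sketch. It assumes the text is stored in the UTF-8 encoding, and the function name describe is invented for this example. For each character it lists the number the computer associates with it (its Unicode code point) and the bytes that would actually be stored on disk.

```python
def describe(text: str) -> list[tuple[str, int, bytes]]:
    """Return (character, Unicode code point, UTF-8 bytes) for each character.

    Hypothetical helper for illustration; UTF-8 is an assumed encoding.
    """
    return [(ch, ord(ch), ch.encode("utf-8")) for ch in text]


# "\u00e9" is the single character "é" (e with acute accent).
for ch, code_point, raw in describe("n\u00e9"):
    print(ch, code_point, raw.hex())
# n 110 6e
# é 233 c3a9
```

Where a human reader sees two letters, the computer sees the numbers 110 and 233, stored as three bytes in total. Any meaning beyond those numbers has to be supplied by us.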