4. Data Collections

Using Text Files

Introduction

Students, imagine you are building an app that keeps track of your class reading list, game scores, or club attendance. If you store that information only in memory while the program is running, it disappears when the program ends. That is where text files become useful. A text file lets a program save and read information outside of memory, so data can be reused later. In AP Computer Science A, using text files connects directly to data collections because files often contain many values that must be read, stored, searched, and sometimes sorted.

In this lesson, you will learn how Java programs use text files, why they are important, and how they fit into the larger topic of data collections. You will also see how file data can be turned into arrays or ArrayList objects, which are common ways to organize large amounts of information. By the end, students, you should be able to explain the main ideas, use the correct terminology, and reason about simple file input tasks in AP-style questions. ✅

Objectives

  • Explain the main ideas and terminology behind using text files.
  • Apply AP Computer Science A reasoning to file input tasks.
  • Connect text files to arrays, ArrayList, and other data collections.
  • Summarize how text files fit into the Data Collections unit.
  • Use examples and evidence to describe how programs work with files.

What a Text File Does

A text file is a file that stores characters in a readable form. Unlike a binary file, which stores data in a more compact machine-friendly format, a text file can be opened and read by a person using a simple editor. Examples include .txt files, CSV files, and other files where values are separated by spaces, commas, or line breaks.

In Java, text files are commonly read using classes such as Scanner, especially when the file contains words, numbers, or lines that need to be processed one by one. The key idea is that the file acts like a source of data, just like a keyboard input or a hard-coded list. Instead of typing values during program execution, the program reads them from the file.

For example, if a file contains:

    87
    92
    78
    100

a program can read those values and store them in an ArrayList<Integer> or an int[]. This is useful when the values represent quiz scores, temperatures, or survey results. 📊
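A minimal Java sketch of that idea follows. It assumes a hypothetical file named scores.txt in the program's working directory; the class name ScoreReader and the method readScores are illustrative names, not part of any library.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Scanner;

class ScoreReader {
    // Reads every integer token the scanner can supply and returns them in order.
    static ArrayList<Integer> readScores(Scanner input) {
        ArrayList<Integer> scores = new ArrayList<Integer>();
        while (input.hasNextInt()) {
            scores.add(input.nextInt());
        }
        return scores;
    }

    public static void main(String[] args) throws FileNotFoundException {
        File file = new File("scores.txt"); // hypothetical file name
        if (!file.exists()) {
            System.out.println("scores.txt not found; create it to try this sketch.");
            return;
        }
        Scanner input = new Scanner(file);
        ArrayList<Integer> scores = readScores(input);
        input.close();
        System.out.println(scores);
    }
}
```

Because readScores accepts any Scanner, the same method also works on a Scanner built from a String, which is convenient for quick testing without a file.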

A few important terms:

  • Input file: a file from which a program reads data.
  • Output file: a file to which a program writes data.
  • Delimiter: a character used to separate data, such as a comma or space.
  • End of file: the point where no more data remains to read.
  • Parsing: converting text into usable data types such as int or double.

Reading Data from a File

When a Java program reads from a text file, it often uses a Scanner created with the file as its source. The process is similar to reading from the keyboard, but the source is a file instead of System.in.

A common pattern looks like this:

    Scanner input = new Scanner(file);

After that, the program may use methods such as hasNext(), hasNextInt(), next(), nextInt(), or nextLine() depending on the kind of data stored in the file. The method choice matters because each one reads different kinds of input.

For example, suppose a file named names.txt contains:

    Ava
    Ben
    Carlos

The program could read each name with nextLine() and store the names in an ArrayList<String>. A simple algorithm is:

  1. Create the file scanner.
  2. Create an empty collection.
  3. Read data while more data exists.
  4. Add each value to the collection.
  5. Use the collection for processing.

This is a very common AP CSA procedure because it combines file input with collection management. The program can then count values, find maximums, search for a match, or print the contents in a different order.
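The five steps above can be sketched in Java as follows. The file name names.txt comes from the example; the class name NameReader and the method readNames are hypothetical names chosen for illustration.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Scanner;

class NameReader {
    // Steps 2 to 4: start with an empty list, then add one line at a time
    // for as long as the scanner reports that another line exists.
    static ArrayList<String> readNames(Scanner input) {
        ArrayList<String> names = new ArrayList<String>();
        while (input.hasNextLine()) {
            names.add(input.nextLine());
        }
        return names;
    }

    public static void main(String[] args) throws FileNotFoundException {
        File file = new File("names.txt");
        if (!file.exists()) {
            System.out.println("names.txt not found; create it to try this sketch.");
            return;
        }
        Scanner input = new Scanner(file);          // Step 1: create the file scanner
        ArrayList<String> names = readNames(input); // Steps 2 to 4
        input.close();
        for (String name : names) {                 // Step 5: use the collection
            System.out.println(name);
        }
    }
}
```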

A major idea to remember, students, is that the file itself is not the same as the collection. The file is the source of data, and the collection is the structure in memory that holds the data after it is read. That distinction is often tested. 🧠

Storing File Data in Collections

Text files and data collections work together closely. Once data is read from a file, it is often stored in an array or an ArrayList so the program can use it efficiently.

Arrays

An array has a fixed size, so it is a good choice when the number of values is known in advance. For example, if a file contains exactly $30$ quiz scores, the program might create an array of size $30$ and fill it as values are read. Arrays are common when the data count is known or when the program needs direct index access.

If the program has an array named scores, then scores[0] holds the first value, scores[1] holds the second, and so on. This is helpful when processing file data with loops.
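As a rough sketch of filling a fixed-size array, the example below reads a known number of integers. A Scanner built from a String stands in for a file so the code runs anywhere; the class name ArrayFiller and the method readFixed are illustrative, and the count of 4 is used instead of 30 only to keep the data short.

```java
import java.util.Scanner;

class ArrayFiller {
    // Fills an array of a known size. If the data runs out early,
    // the remaining slots keep their default value of 0.
    static int[] readFixed(Scanner input, int count) {
        int[] scores = new int[count];
        int i = 0;
        while (i < count && input.hasNextInt()) {
            scores[i] = input.nextInt();
            i++;
        }
        return scores;
    }

    public static void main(String[] args) {
        // Stand-in for a file scanner: same methods, different source.
        Scanner input = new Scanner("87 92 78 100");
        int[] scores = readFixed(input, 4);
        System.out.println(scores[0]); // first value read: 87
        System.out.println(scores[3]); // fourth value read: 100
    }
}
```

Checking both i < count and hasNextInt() guards against two different problems: writing past the end of the array and reading past the end of the data.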

ArrayList

An ArrayList is often more flexible because its size can grow as needed. If a file may contain any number of values, an ArrayList is a convenient choice. For example, a teacher's attendance file might have a different number of names each day. Reading each line and adding it to an ArrayList<String> allows the program to store as many names as needed.

A program might do something like this conceptually:

  • read one value from the file
  • add it to the list
  • repeat until the file ends

This pattern is powerful because it supports unknown data sizes, which are common in real-world programs.

Real-world example

Imagine a school club stores meeting attendance in a file with one student name per line. The program reads every line into an ArrayList<String>, then checks whether a particular student's name appears in the list. The file preserves the information between runs, while the collection lets the program work with the data during execution.
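One possible sketch of the attendance check is shown below. The names in the sample data, the class name AttendanceCheck, and the method attended are all made up for illustration, and a String-backed Scanner stands in for the attendance file.

```java
import java.util.ArrayList;
import java.util.Scanner;

class AttendanceCheck {
    // Loads one name per line into a list, then reports whether target is present.
    static boolean attended(Scanner input, String target) {
        ArrayList<String> names = new ArrayList<String>();
        while (input.hasNextLine()) {
            // trim() removes stray spaces around a name
            names.add(input.nextLine().trim());
        }
        return names.contains(target);
    }

    public static void main(String[] args) {
        String data = "Ava\nBen\nCarlos"; // stand-in for the file's contents
        System.out.println(attended(new Scanner(data), "Ben"));  // true
        System.out.println(attended(new Scanner(data), "Dana")); // false
    }
}
```

Note that contains compares with equals, so the match is case-sensitive: "ben" would not match "Ben".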

Searching and Processing File Data

Once file data is in a collection, it can be searched or processed using standard AP CSA algorithms. Searching means looking for a particular value, while processing means doing something with each element, such as counting, summing, or finding extremes.

If a file contains test scores and the program stores them in an array, it can use a linear search to check whether a score of $90$ appears. It can also compute the average by adding all values and dividing by the number of scores:

$$
\text{average} = \frac{\text{sum of scores}}{\text{number of scores}}
$$

A linear search is appropriate because the data collection is not necessarily sorted. The program checks each item one at a time until it finds the target or reaches the end.
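Both operations can be sketched on an array that has already been loaded from a file. The class name ScoreStats and the sample scores are illustrative, and the average method assumes the array holds at least one score.

```java
class ScoreStats {
    // Linear search: checks each element in order.
    // Returns the index of the first match, or -1 if target is absent.
    static int linearSearch(int[] scores, int target) {
        for (int i = 0; i < scores.length; i++) {
            if (scores[i] == target) {
                return i;
            }
        }
        return -1;
    }

    // Average = sum of scores / number of scores.
    // The cast to double avoids integer division.
    static double average(int[] scores) {
        int sum = 0;
        for (int score : scores) {
            sum += score;
        }
        return (double) sum / scores.length;
    }

    public static void main(String[] args) {
        int[] scores = {87, 92, 78, 100}; // pretend these were read from a file
        System.out.println(linearSearch(scores, 90)); // -1, since 90 is not present
        System.out.println(linearSearch(scores, 78)); // 2
        System.out.println(average(scores));          // 89.25
    }
}
```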

File data can also support sorting. A program might read names or scores from a file, place them in a collection, and then sort them for display. Sorting is useful when you want data to be easier to read, compare, or search later. For example, a list of student names from a file can be sorted alphabetically before printing a roster.
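A short sketch of the roster example follows. It uses Collections.sort from the Java standard library for brevity; an AP-style solution might instead hand-code selection sort or insertion sort. The class name RosterSorter and the sample names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Scanner;

class RosterSorter {
    // Reads one name per line, then sorts the list in String's natural order.
    static ArrayList<String> sortedRoster(Scanner input) {
        ArrayList<String> names = new ArrayList<String>();
        while (input.hasNextLine()) {
            names.add(input.nextLine());
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // A String-backed Scanner stands in for the names file.
        Scanner input = new Scanner("Carlos\nAva\nBen");
        System.out.println(sortedRoster(input)); // [Ava, Ben, Carlos]
    }
}
```

String's natural order is case-sensitive, so names that start with uppercase letters sort before names that start with lowercase letters.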

It is important to understand that the file-reading step usually happens first, and then the algorithmic work begins. In AP CSA questions, you may be asked to trace how values move from the file into the collection and then through a loop. 🔄

Common AP CSA Considerations

AP Computer Science A often checks whether students understand the behavior of file-reading code. Some common ideas include:

  • next() reads one token at a time, while nextLine() reads an entire line.
  • hasNext() or hasNextInt() helps prevent reading past the end of the file.
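  • Data type matching is important. If the file contains integers, read them with nextInt(); if it contains decimal values, use nextDouble(). Calling nextInt() when the next token is not an integer causes an InputMismatchException.
  • Exceptions can occur if the file is missing or cannot be opened. In many classroom examples, this is handled by adding throws FileNotFoundException to the header of the method that creates the Scanner.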
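The difference between token-based and line-based reading can be seen in a small sketch. The sample data, the class name TokenVsLine, and both method names are invented for illustration; a String-backed Scanner stands in for a file.

```java
import java.util.Scanner;

class TokenVsLine {
    // Counts whitespace-separated tokens using hasNext() and next().
    static int countTokens(Scanner input) {
        int count = 0;
        while (input.hasNext()) {
            input.next();
            count++;
        }
        return count;
    }

    // Counts whole lines using hasNextLine() and nextLine().
    static int countLines(Scanner input) {
        int count = 0;
        while (input.hasNextLine()) {
            input.nextLine();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String data = "Ava Lee\nBen Cho"; // two lines, two words per line
        System.out.println(countTokens(new Scanner(data))); // 4
        System.out.println(countLines(new Scanner(data)));  // 2
    }
}
```

In both loops the hasNext-style check runs before each read, which is what keeps the program from reading past the end of the data.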

Example reasoning question: if a file contains the values $5$, $10$, and $15$, and the program uses a loop to add them to a total, the final total is:

$$
5 + 10 + 15 = 30
$$

That kind of reasoning is very similar to AP exam analysis. You do not need to memorize a huge amount of file syntax, but you do need to understand what the code is doing and why.
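The loop in that reasoning question might look like the sketch below, which accumulates the total while reading rather than storing the values first. The class name TotalDemo and the method total are illustrative, and a String-backed Scanner stands in for the file.

```java
import java.util.Scanner;

class TotalDemo {
    // Adds each integer to a running total as it is read.
    static int total(Scanner input) {
        int total = 0;
        while (input.hasNextInt()) {
            total += input.nextInt();
        }
        return total;
    }

    public static void main(String[] args) {
        // Trace: total goes 0 -> 5 -> 15 -> 30
        System.out.println(total(new Scanner("5 10 15"))); // 30
    }
}
```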

Another important idea is that file input can make programs more flexible. Instead of changing code every time the data changes, the program can read a new file. This is one reason text files are useful in real applications such as grading systems, inventory trackers, and logs.

How Text Files Fit into Data Collections

Text files are not just a separate topic. They are a way of getting data into collections. In the Data Collections unit, the main focus is on managing many values with arrays, ArrayList, and 2D arrays. Text files support that work by providing input data that can fill those structures.

Here is the connection:

  • A text file stores data outside the program.
  • The program reads the data into an array, ArrayList, or 2D array.
  • The collection allows searching, sorting, counting, and other operations.
  • The results can be displayed, analyzed, or written back out.

For example, a file might contain rows of scores for several students. The program could read the data into a 2D array, where each row represents a student and each column represents a quiz. That shows how file input can connect directly to multidimensional data structures.
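A sketch of the 2D case is shown below, under the assumption that the number of students (rows) and quizzes (columns) is known before reading. The class name ScoreGrid, the method readGrid, and the sample scores are hypothetical, and a String-backed Scanner stands in for the file.

```java
import java.util.Scanner;

class ScoreGrid {
    // Reads rows * cols integers in row-major order:
    // all of the first student's quizzes, then the second student's, and so on.
    static int[][] readGrid(Scanner input, int rows, int cols) {
        int[][] grid = new int[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                if (input.hasNextInt()) {
                    grid[r][c] = input.nextInt();
                }
            }
        }
        return grid;
    }

    public static void main(String[] args) {
        // Two students, three quizzes each.
        Scanner input = new Scanner("90 85 70\n60 75 80");
        int[][] grid = readGrid(input, 2, 3);
        System.out.println(grid[0][0]); // first student's first quiz: 90
        System.out.println(grid[1][2]); // second student's third quiz: 80
    }
}
```

Because nextInt() skips any whitespace, including line breaks, the line structure of the file does not have to match the array's rows for this approach to work; only the order of the values matters.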

This is why using text files matters in AP CSA. It supports persistence, flexibility, and realistic problem-solving. In many programs, file data is the starting point for everything else the program does.

Conclusion

Students, text files are a practical way for programs to read and reuse data. They connect strongly to collections because data from a file is often stored in arrays or ArrayList objects before being processed. In AP Computer Science A, you should know the key terms, understand how file input works, and be able to reason about how data moves from a file into a collection and then through searching, sorting, or other algorithms.

When you see a file-related problem, ask yourself: What is being read? How is it stored? What collection is used? What happens after the data enters memory? If you can answer those questions, you are well prepared to connect using text files to the larger Data Collections topic. ✅

Study Notes

  • A text file stores data as readable characters and can be used as input for a program.
  • In Java, file data is often read with a Scanner created from a file.
  • next(), nextInt(), and nextLine() read different kinds of input.
  • hasNext() and similar methods help a program know when more data is available.
  • Text file data is often stored in an array or ArrayList after reading.
  • Arrays have fixed size; ArrayList can grow as needed.
  • File data can be searched with linear search, processed in loops, and sorted after loading.
  • File input is commonly used with real-world data such as scores, names, attendance, and inventories.
  • Text files help programs reuse data without hard-coding values.
  • In Data Collections, text files are the starting point for many collection-based tasks.
