Data Representation in System Fundamentals

Welcome, students! In this lesson, you will explore how computers store, process, and communicate information using bits and bytes 💻📘. Data representation is one of the most important ideas in IB Computer Science SL because every image, sound, number, character, and instruction must be turned into binary for a computer to handle it. By the end of this lesson, you should be able to explain key terms, apply basic methods of data representation, and connect the topic to the wider area of System Fundamentals.

Learning objectives:

Explain the main ideas and terminology behind data representation
Apply IB Computer Science SL reasoning and procedures related to data representation
Connect data representation to system architecture and operation
Summarize how data representation fits within System Fundamentals
Use real examples to show how data representation works in computer systems

What data representation means

Computers do not understand text, pictures, or music directly. Instead, they use patterns of bits, where each bit is a $0$ or a $1$. A bit is the smallest unit of data in a computer system. Eight bits make a byte, so $1\ \text{byte} = 8\ \text{bits}$. Larger units are also used, such as kilobytes, megabytes, gigabytes, and terabytes.
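As a quick worked example (added here for illustration), each extra bit doubles the number of patterns that can be stored, so $n$ bits give $2^n$ different patterns:

$$2^1 = 2, \quad 2^2 = 4, \quad 2^4 = 16, \quad 2^8 = 256$$

So a single byte can hold any one of $256$ different patterns.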

Data representation is the process of converting information into a format that a computer can store, process, and transmit. This happens all the time. For example, when a student types a message, the letters are converted into binary codes. When a photo is uploaded, the colours of each pixel are stored as numbers. When a song is streamed, the sound is represented as digital samples.

A key idea in this topic is that data representation is always a compromise between accuracy, storage space, and speed. More detailed data often requires more bits. A high-quality image may look better, but it also takes more memory and more time to send across a network.

Binary, bits, and bytes

Binary is the number system used by computers because electronic circuits have two stable states, often represented as on/off or high/low voltage. In binary, each position has a value that is a power of $2$. For example, the binary number $1011$ represents:

$$1 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 1 \times 2^0 = 8 + 0 + 2 + 1 = 11$$

So $1011_2 = 11_{10}$. Understanding binary is essential because all other data types are built from it.
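The place-value calculation above can be sketched in Python. This is a minimal illustration, and the function name `binary_to_decimal` is our own choice rather than anything from the syllabus or a library.

```python
def binary_to_decimal(bits: str) -> int:
    """Convert a string of 0s and 1s to its decimal value using place values."""
    total = 0
    # Walk from the rightmost bit (worth 2^0) towards the leftmost bit.
    for position, bit in enumerate(reversed(bits)):
        if bit not in ("0", "1"):
            raise ValueError(f"not a binary digit: {bit!r}")
        total += int(bit) * 2 ** position
    return total


print(binary_to_decimal("1011"))  # 8 + 0 + 2 + 1 = 11
print(int("1011", 2))             # Python's built-in conversion gives the same value
```

Writing the loop by hand shows where each power of $2$ comes from; in everyday code the built-in `int(text, 2)` does the same job.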

Bits and bytes are used to measure storage capacity. For example, a text file may be $5\ \text{KB}$, while a video file may be $2\ \text{GB}$. The larger the amount of data, the more memory is needed. This matters in system architecture because memory, storage devices, and communication channels must all handle data efficiently.

A useful real-world analogy is digital lockers 🔒. A bit is like a single locker that can be either empty or full. A byte is a small row of $8$ lockers that can store a larger pattern. Computers use huge numbers of these “lockers” to represent all kinds of information.

Representing text, images, and sound

Different types of data are represented in different ways.

Text: Characters are represented using character encoding systems such as ASCII and Unicode. ASCII uses $7$ bits for basic English characters, allowing $2^7 = 128$ possible combinations. Unicode is much larger and can represent characters from many languages, symbols, and emojis. This is important because modern systems need to support global communication.
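To see character codes in action, here is a short Python sketch. It assumes UTF-8 as the way the Unicode text is stored; UTF-8 is one common Unicode encoding and is our choice for the example, not something the lesson requires.

```python
message = "Hi"

# ord() gives the numeric code of each character;
# format(..., "07b") shows that code as a 7-bit binary pattern.
for ch in message:
    print(ch, ord(ch), format(ord(ch), "07b"))

# Characters in the ASCII range fit in one byte in UTF-8,
# while a character such as an emoji needs several bytes.
ascii_bytes = "A".encode("utf-8")
emoji_bytes = "🎵".encode("utf-8")
print(len(ascii_bytes), len(emoji_bytes))  # 1 4
```

The letter `H` has code $72$, which is $1001000$ in binary, so it fits comfortably in the $7$ bits that ASCII provides.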

Images: A digital image is made of pixels, and each pixel stores colour values. In a simple black-and-white image, each pixel might be represented by one bit. In colour images, more bits are needed to represent red, green, and blue values. A common model is RGB, where each colour channel is stored numerically. More bits per pixel usually means better image quality but larger file size.

Sound: Sound is an analogue wave in the real world, but computers store sound digitally by sampling it at regular intervals. Each sample measures the amplitude of the wave. The sampling rate is measured in hertz; for example, $44.1\ \text{kHz}$ means $44\,100$ samples are taken every second. A higher sampling rate captures more detail, but it also creates a larger file. Bit depth also affects quality: more bits per sample allow more precise values.

For example, a music app that stores audio at a higher sampling rate and bit depth can preserve more detail, which is useful for studio recordings 🎵. However, a lower-quality version may be chosen for faster streaming on a mobile network.
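The trade-off can be estimated with a rough calculation. The sketch below assumes uncompressed audio, and the number of channels (stereo or mono) is an extra assumption that the lesson does not mention; the function name and the specific settings are illustrative only.

```python
def audio_size_bytes(sample_rate_hz: int, bit_depth: int,
                     channels: int, seconds: float) -> float:
    """Uncompressed size: samples per second x bits per sample x channels x duration."""
    total_bits = sample_rate_hz * bit_depth * channels * seconds
    return total_bits / 8  # 8 bits in a byte


# One minute of audio at two hypothetical quality settings.
high_quality = audio_size_bytes(44_100, 16, 2, 60)  # stereo
low_quality = audio_size_bytes(22_050, 8, 1, 60)    # mono

print(high_quality)                 # 10584000.0 bytes
print(low_quality)                  # 1323000.0 bytes
print(high_quality / low_quality)   # 8.0 times larger
```

Halving the sampling rate, halving the bit depth, and dropping to one channel each cut the size in half, which is why the lower setting is $8$ times smaller here.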

Number systems and conversion

IB Computer Science SL expects students to understand how numbers can be represented in binary and to move between number systems. Binary is the base-$2$ number system. Decimal is base-$10$, which humans use daily. Hexadecimal is base-$16$, and it is often used because it makes binary easier to read.

A hexadecimal digit represents $4$ bits. For example, the binary pattern $11110000$ can be grouped as $1111\ 0000$, which becomes $F0_{16}$. This is shorter and easier to interpret than a long binary string.
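The grouping method can be sketched in Python as follows. The function name `binary_to_hex` is hypothetical, and the left-padding step is our addition so that patterns whose length is not a multiple of $4$ still work.

```python
def binary_to_hex(bits: str) -> str:
    """Convert a binary string to hexadecimal by grouping bits in fours."""
    # Pad with leading zeros so the length is a multiple of 4.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    # Each 4-bit group has a value from 0 to 15, which maps to one hex digit.
    return "".join("0123456789ABCDEF"[int(group, 2)] for group in groups)


print(binary_to_hex("11110000"))  # F0
print(binary_to_hex("101011"))    # 2B (padded to 0010 1011)
```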

Let’s look at a decimal-to-binary example. To convert $13_{10}$ into binary, find powers of $2$ that add to $13$:

$$13 = 8 + 4 + 1 = 2^3 + 2^2 + 2^0$$

So $13_{10} = 1101_2$.
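The same powers-of-$2$ method can be written as a short Python function. This is a sketch for non-negative whole numbers only, and the name `decimal_to_binary` is our own.

```python
def decimal_to_binary(n: int) -> str:
    """Convert a non-negative integer to binary by subtracting powers of 2."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    if n == 0:
        return "0"
    # Find the largest power of 2 that fits into n.
    power = 1
    while power * 2 <= n:
        power *= 2
    bits = ""
    # Work down through the powers: write 1 if the power fits, otherwise 0.
    while power >= 1:
        if n >= power:
            bits += "1"
            n -= power
        else:
            bits += "0"
        power //= 2
    return bits


print(decimal_to_binary(13))  # 1101
```

For $13$, the powers tested are $8$, $4$, $2$, and $1$: $8$ fits, $4$ fits, $2$ does not, and $1$ fits, which gives $1101$.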

This skill matters in system fundamentals because addresses, colours, machine instructions, and memory values often rely on binary and hexadecimal. For example, memory locations may be shown in hexadecimal because it is compact and readable.

Accuracy, resolution, and file size

Data representation affects quality and efficiency. In images, resolution means the number of pixels used to create the picture. A $1920 \times 1080$ image contains more pixels than a $640 \times 480$ image, so it usually looks sharper. But it also needs more storage.
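A rough size estimate makes the difference concrete. The sketch below assumes an uncompressed image and $24$ bits per pixel ($8$ bits for each of red, green, and blue); that colour depth is a common choice but is our assumption, not a figure given in the lesson.

```python
def image_size_bytes(width: int, height: int, bits_per_pixel: int) -> float:
    """Uncompressed size: number of pixels x bits per pixel, converted to bytes."""
    return width * height * bits_per_pixel / 8


large = image_size_bytes(1920, 1080, 24)
small = image_size_bytes(640, 480, 24)

print(large)          # 6220800.0 bytes
print(small)          # 921600.0 bytes
print(large / small)  # 6.75 times more storage
```

Real image files are usually smaller than these figures because most formats apply compression.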

For sound, quality depends on sampling rate and bit depth. If the sampling rate is too low, the sound may lose detail. This is a form of information loss. In some systems, lossy compression is used to reduce file size, meaning some data is removed permanently. This is acceptable when exact reproduction is not essential, such as streaming music or video.

In contrast, lossless compression reduces file size without losing any original information. This is useful for text files, program files, and medical data, where accuracy is critical.
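To show what "without losing any original information" means, here is a sketch of run-length encoding, one simple lossless technique. Run-length encoding is our choice of example and is not named in this lesson; the function names and the sample row of pixels are hypothetical.

```python
def rle_encode(text: str) -> list[tuple[str, int]]:
    """Replace each run of repeated characters with a (character, count) pair."""
    runs: list[tuple[str, int]] = []
    for ch in text:
        if runs and runs[-1][0] == ch:
            # Same character as the current run: increase its count.
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            # A different character starts a new run.
            runs.append((ch, 1))
    return runs


def rle_decode(runs: list[tuple[str, int]]) -> str:
    """Rebuild the original text exactly from the (character, count) pairs."""
    return "".join(ch * count for ch, count in runs)


row = "WWWWWWBBBWWWW"  # one row of white (W) and black (B) pixels
encoded = rle_encode(row)

print(encoded)                     # [('W', 6), ('B', 3), ('W', 4)]
print(rle_decode(encoded) == row)  # True: nothing was lost
```

The decoded result matches the original exactly, which is the defining property of lossless compression. A lossy method could not pass this check.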

A good example is a smartphone camera 📱. If a student takes a photo in high resolution, the file is larger but more detailed. If the photo is compressed for social media, it uploads faster but may lose some quality. This trade-off is central to data representation.

Data representation in system fundamentals

Data representation is not separate from the rest of the computer system. It connects directly to input, processing, storage, and output.

When a device receives input, it must convert the input into binary. The processor then works on that binary data. Memory stores it temporarily, and secondary storage keeps it for later use. Output devices turn the binary data back into something people can understand, such as a display image, printed page, or audio signal.

This means that data representation supports the entire computer system pipeline. Without it, there would be no way for hardware and software to communicate in a consistent format. For example, a spreadsheet application stores numbers using binary, calculates with them in the CPU, and displays the final answer on the screen.

Data representation also affects performance. Smaller files may transfer faster across a network and load more quickly from storage. But if the representation is too compressed or too simple, it may reduce quality or accuracy. This is why system designers choose formats carefully based on the task.
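The effect of file size on transfer time can be estimated with a simple division. Every figure below is hypothetical: the link speed, the compressed size, and the function name are assumptions made for illustration, and real networks add overheads that this sketch ignores.

```python
def transfer_time_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Ideal transfer time: total bits divided by bits sent per second."""
    return size_bytes * 8 / link_bits_per_second


uncompressed = 6_220_800  # bytes: a 1920 x 1080 image at an assumed 24 bits per pixel
compressed = 622_080      # bytes: hypothetical size after compression (one tenth)
link = 10_000_000         # hypothetical link speed of 10 megabits per second

print(transfer_time_seconds(uncompressed, link))  # about 4.98 seconds
print(transfer_time_seconds(compressed, link))    # about 0.50 seconds
```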

Why this matters in real life

Data representation is everywhere 🌍. Banks use encoded data for account numbers and transactions. Hospitals use digital images and patient records. GPS systems use coordinates and maps stored in binary. Streaming platforms use compressed audio and video to deliver content efficiently. Even online games depend on data representation to store graphics, player movement, and network messages.

If a student understands data representation, it becomes easier to understand why file sizes differ, why some images are sharper than others, and why some programs need more storage and memory. It also helps explain why different devices or apps use different formats.

Conclusion

Data representation is the foundation of how computers handle information. Everything a computer uses must be converted into binary, whether it is text, numbers, images, or sound. In IB Computer Science SL, this topic helps students understand how systems store data, process it efficiently, and balance quality against file size and speed. Data representation is a core part of System Fundamentals because it links hardware, software, memory, storage, and communication into one working system.

Study Notes

A bit is a $0$ or $1$, and a byte is $8$ bits.
Computers use binary because electronic systems have two stable states.
Text is stored using character encoding systems such as ASCII and Unicode.
Images are made of pixels, and colour is stored numerically, often using RGB.
Sound is represented by sampling an analogue wave at regular intervals.
More bits usually mean greater accuracy and larger file size.
Resolution affects image detail, and sampling rate affects audio quality.
Hexadecimal is base-$16$ and is useful for reading binary more easily.
Lossless compression keeps all original data, while lossy compression removes some data.
Data representation connects directly to input, processing, memory, storage, and output in computer systems.
Understanding data representation helps explain real-world technology such as cameras, smartphones, streaming apps, and databases.