Text, Images, and Sound Representation
Every digital device stores information in bits, but people want to read words, see pictures, and hear audio 🎵. This lesson explains how computers turn human-friendly media into binary data and back again. Understanding this helps you explain how files work, why compression matters, and how system design affects storage, speed, and quality.
Learning goals
By the end of this lesson, students will be able to:
- Explain how text, images, and sound are represented in binary.
- Use correct terminology such as ASCII, Unicode, pixel, resolution, sample rate, and bit depth.
- Apply basic IB Computer Science SL reasoning to file size and data representation.
- Connect media representation to system performance, storage, and quality trade-offs.
- Use examples to show how different representation choices affect real-world computing.
Text representation: turning characters into bits
Computers do not store letters directly. They store numbers, and those numbers are interpreted as characters. For example, in ASCII the letter $A$ is stored as the number $65$ (binary $1000001$). A character encoding is a rule that maps characters to numeric codes.
A common early encoding is ASCII, which stands for American Standard Code for Information Interchange. ASCII uses $7$ bits for each character, so it can represent $2^7 = 128$ different codes. That was enough for English letters, numbers, punctuation, and control characters, but not for all world languages.
Unicode is a much larger character set designed to represent characters from many writing systems, symbols, and emojis 🌍. Unicode assigns each character a unique code point. A common encoding for Unicode is UTF-8, which uses $1$ to $4$ bytes per character depending on the symbol. This makes it efficient for English text while still supporting almost all modern writing systems.
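The variable-length behaviour of UTF-8 is easy to see directly. The short Python sketch below (an illustration, not part of the IB syllabus) encodes a few characters and prints how many bytes each one needs:

```python
# UTF-8 uses a variable number of bytes per character.
# str.encode lets us inspect the byte length of each symbol.
samples = ["A", "é", "€", "🌍"]  # 1, 2, 3, and 4 bytes respectively
for char in samples:
    encoded = char.encode("utf-8")
    print(f"{char!r} -> {len(encoded)} byte(s): {encoded.hex()}")
```

Plain English text stays at one byte per character, which is why UTF-8 is both compact for English and able to represent almost any writing system.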
Example: storing a short message
If a message has $20$ characters and each character uses $1$ byte, then the text data size is:
$$20 \times 8 = 160 \text{ bits}$$
or
$$20 \times 1 = 20 \text{ bytes}$$
In real systems, text files may use more space because of metadata, formatting, or encoding overhead. A plain text file stores only characters, while a document file may also store fonts, spacing, and styles.
Why text representation matters
Text representation affects compatibility and storage. If two systems use different encodings, text can appear incorrect. This is why the same file might look fine on one device but show strange symbols on another. In system fundamentals, this links directly to data representation, interoperability, and file management.
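The "strange symbols" effect can be reproduced deliberately. In this small sketch, text saved as UTF-8 is read back with the wrong decoder (Latin-1), producing mojibake:

```python
# Encoding mismatch demo: text written as UTF-8 but read as Latin-1
# appears as "mojibake" (garbled symbols).
original = "café"
utf8_bytes = original.encode("utf-8")   # b'caf\xc3\xa9'
misread = utf8_bytes.decode("latin-1")  # wrong decoder on purpose
print(misread)  # the accented character becomes two wrong symbols
```

The bytes themselves never changed; only the interpretation did. This is why agreeing on an encoding matters for interoperability.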
Image representation: pixels, color, and resolution
A digital image is usually made of tiny squares called pixels. Each pixel stores color information. The more pixels an image has, the more detail it can show. This is called resolution, often written as width × height, such as $1920 \times 1080$.
If an image is $1920 \times 1080$, then the total number of pixels is:
$$1920 \times 1080 = 2{,}073{,}600 \text{ pixels}$$
Each pixel must store color data. In the RGB color model, color is made from red, green, and blue values. If each channel uses $8$ bits, then each pixel uses:
$$8 + 8 + 8 = 24 \text{ bits per pixel}$$
That means a raw image of $1920 \times 1080$ pixels would need:
$$2{,}073{,}600 \times 24 = 49{,}766{,}400 \text{ bits}$$
or about:
$$\frac{49{,}766{,}400}{8} = 6{,}220{,}800 \text{ bytes}$$
which is about $6.22$ MB.
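The arithmetic above generalises to any resolution and color depth. Here is a minimal Python helper (names are my own, chosen for the example) that reproduces the calculation:

```python
def raw_image_size_bytes(width: int, height: int, bits_per_pixel: int = 24) -> int:
    """Estimate the size of an uncompressed bitmap image in bytes."""
    total_bits = width * height * bits_per_pixel
    return total_bits // 8

size = raw_image_size_bytes(1920, 1080)
print(f"{size} bytes ≈ {size / 1_000_000:.2f} MB")  # matches the worked example
```

Doubling the width and height quadruples the pixel count, so file size grows quickly with resolution.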
Example: why high-resolution images take more space
A phone photo with millions of pixels can look sharp, but it also uses a lot of storage. A small icon may only need a few hundred pixels, so it is much smaller. This is a trade-off: more detail usually means a larger file.
Bitmap and vector images
Raster (bitmap) images store each pixel separately. They are great for photographs but can become blurry when enlarged because the pixels become visible.
Vector images store shapes using mathematical descriptions rather than pixels. For example, a circle can be stored using its center and radius. Vector graphics scale well without losing quality, so they are ideal for logos and diagrams.
This distinction is important in IB Computer Science SL because it shows how data representation affects quality, editing, and file size.
Sound representation: sampling the real world
Sound is a continuous wave in the real world, but computers need discrete values. To store sound digitally, a device samples the sound wave at regular intervals. Each sample records the amplitude of the wave at a specific moment.
Sample rate
Sample rate is the number of samples taken per second, measured in hertz. For example, a sample rate of $44{,}100\ \text{Hz}$ means $44{,}100$ samples per second. Higher sample rates usually capture sound more accurately, but they also increase file size 🎧.
Bit depth
Bit depth is the number of bits used for each sample. A $16$-bit sample can represent $2^{16} = 65{,}536$ possible values. A larger bit depth gives better dynamic range and more precise amplitude levels.
File size formula for uncompressed audio
A common formula for the size of uncompressed audio is:
$$\text{File size} = \text{sample rate} \times \text{bit depth} \times \text{number of channels} \times \text{duration}$$
If the result is in bits, divide by $8$ to get bytes.
Example: stereo audio
Suppose a $10$-second stereo clip uses a sample rate of $44{,}100\ \text{Hz}$ and a bit depth of $16$ bits.
$$44{,}100 \times 16 \times 2 \times 10 = 14{,}112{,}000 \text{ bits}$$
Convert to bytes:
$$\frac{14{,}112{,}000}{8} = 1{,}764{,}000 \text{ bytes}$$
That is about $1.76$ MB before compression.
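The same formula can be written as a reusable function. This is a sketch for the lesson, not a standard library routine:

```python
def uncompressed_audio_bytes(sample_rate: int, bit_depth: int,
                             channels: int, seconds: float) -> float:
    """File size = sample rate × bit depth × channels × duration, in bytes."""
    bits = sample_rate * bit_depth * channels * seconds
    return bits / 8

size = uncompressed_audio_bytes(44_100, 16, 2, 10)
print(f"{size:.0f} bytes ≈ {size / 1_000_000:.2f} MB")  # matches the worked example
```

Halving any one factor (mono instead of stereo, $8$-bit instead of $16$-bit) halves the file size, which is exactly the trade-off audio engineers weigh.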
Why sampling matters
If the sample rate is too low, important parts of the sound wave can be missed. The sampling theorem says the sample rate must be at least twice the highest frequency you want to capture; $44{,}100$ Hz comfortably covers human hearing, which tops out around $20{,}000$ Hz. Sampling below that threshold causes distortion or loss of quality. In practice, audio systems choose a balance between quality and file size, depending on the purpose. A voice memo can use lower settings than a music studio recording.
Compression, quality, and practical choices
Media files can become very large, so compression is often used. Compression reduces file size by removing redundancy or less important data.
Lossless compression reduces file size without permanently losing information. Text files and some image formats use lossless compression so the original can be perfectly recovered.
Lossy compression removes data that people are less likely to notice. JPEG images and MP3 audio are common examples. This can greatly reduce file size, but repeated saving may reduce quality over time.
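Lossless compression can be demonstrated with Python's built-in `zlib` module. Repetitive data compresses well because it contains redundancy, and decompressing restores every byte exactly:

```python
import zlib

# Lossless compression: the original data is perfectly recoverable.
text = ("the quick brown fox " * 100).encode("utf-8")
compressed = zlib.compress(text)
restored = zlib.decompress(compressed)

assert restored == text  # nothing was lost
print(f"original: {len(text)} bytes, compressed: {len(compressed)} bytes")
```

Lossy formats like JPEG and MP3 go further by discarding information, which is why they achieve much smaller sizes but cannot restore the original exactly.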
This is a key system fundamentals idea: representation choices affect storage, transmission speed, processing load, and user experience. A video call, for example, must compress sound and images quickly so data can travel over a network in real time.
Real-world connection
- A school website uses text for instructions, images for diagrams, and sound for accessibility features.
- A music app must store and stream audio efficiently so songs load quickly.
- A medical image needs high detail, so compression choices must preserve important information.
These examples show that representation is not just theory. It directly affects how systems are designed and used.
Conclusion
Text, images, and sound are all stored as binary data, but each has different methods of representation. Text uses character encodings like ASCII and Unicode. Images use pixels, resolution, and color values such as RGB. Sound uses sampling, sample rate, and bit depth. These ideas connect to storage size, quality, compression, compatibility, and performance. Understanding them helps students explain how real computer systems work and why designers make certain choices.
Study Notes
- Text is stored using character encodings, not as actual letters.
- ASCII uses $7$ bits per character and supports $128$ codes.
- Unicode supports many languages and symbols, including emojis.
- UTF-8 uses $1$ to $4$ bytes per character.
- Images are made of pixels, and resolution is the number of pixels in width and height.
- RGB color commonly uses $24$ bits per pixel when each channel uses $8$ bits.
- Bitmap images are pixel-based; vector images are mathematically described.
- Sound is sampled from a continuous wave into discrete values.
- Sample rate is measured in hertz, and bit depth controls the number of possible amplitude values.
- Higher sample rate and bit depth usually mean better quality and larger files.
- Uncompressed audio file size can be estimated with the formula $\text{sample rate} \times \text{bit depth} \times \text{channels} \times \text{duration}$.
- Lossless compression preserves all data; lossy compression reduces size by removing less noticeable information.
- Media representation choices affect storage, speed, quality, and compatibility across systems.
- These topics are part of System Fundamentals because they show how data is represented and managed in computer systems.
