Course notes for HarvardX CS50 Introduction to Computer Science 2021.
Binary numbers use a base-2 numeral system where each place represents one bit.
16 | 8 | 4 | 2 | 1 |
---|---|---|---|---|
25 | 24 | 23 | 22 | 21 |
Bytes represent more complex data, like characters, by referencing standardized encodings. 01000001
is a
in ASCII which is an 8 bit encoding. Unicode is a subset of ASCII that uses one byte for the first 128 code points and up to four bytes for other characters.
Step by step process for solving a problem. Comprised of:
Other features
What questions rise from executing simple tasks?
Writing good code: correctness, design, and style.
Main article c (programming language)
An incremented value exceeds the available bits. 111
in binary plus one should equal 8, but without adding another bit to hold 1000
, the available bits become 000
.
Convert source code to machine code.
#
Tool number one is to utilize diagnostic print statements to display the state of the program.
Tool number two is to utilize a debugger. For example, the GDB: Gnu Debugger allows you to see what is going on ‘inside’ another program while it executes or what another program was doing at the moment it crashed.
Tool number three is to utilize a ‘rubber ducky’. Talking through a problem aloud, to a ‘rubber ducky’, can help make the issue apparent.
See c programming language data types.
Random access memory (RAM) is fast but volatile. It is purely electronic—requires electricity to store data—unlike long term memory which does not.
See c programming language arrays.
for i from 0 to n-1
if the i'th element is 50
return true
return false
if no items
return false
if number is middle item
return true
else if number < middle item
search left half
else if number > middle item
search right half
for i from 0 to n-1
find smallest item between i'th item and last item
swap smallest item with i'th item
repeat n-1 times
for i from 0 to n-2
if i'th and i+1 elements out of order
swap them
if no swaps
quit
if only one item
quit
else
sort left half of items
sort right half of items
merge sorted halves
Recursion is when a function calls itself. Useful when the structure of data is defined by referring to itself.
Hexadecimal is base-16; uses 16 digits: 0 1 2 3 4 5 6 7 8 9 A B C D E F
.
4096 | 256 | 16 |
---|---|---|
163 | 162 | 161 |
Binary 11111111
(255) is equal to 0xFF
in hex.
The ambiguous prefix 0x
is added to hexadecimal numbers for clarity.
See c programming language pointers.
We can use pointers to get access to a block of dynamically-allocated memory at runtime. Allocated memory must be returned to the system.
malloc
returns a pointer.
// Statically obtain an integer
int x;
// Dynamically obtain an integer
int *px = malloc(sizeof(int));
// Get an integer from the user
int x = GetInt();
// Array of floats on the stack
float stack_array[x];
// Array of floats on the heap
float* heap_array = malloc(x * sizeof(float));
See data structure.
Arrays occupy contiguous bytes of memory.
Disadvantage is that arrays must be duplicated to grow in size, making them difficult to resize efficiently.
Linked lists are dynamic data structures comprised of nodes pointing to other nodes.
// The name `node` here is similar to a
// function prototype. It is required to
// use `node` inside of this typedef.
typedef struct _node {
int number;
struct _node *next;
} node;
Insertion running time is O(1). Disadvantage is the lack of random access and increased memory usage.
Binary search trees are recursive data structures where the height of the tree is log n.
typedef struct _node {
int number;
struct _node *left;
struct _node *right;
} node;
The running time of insert on a balanced binary search tree is equivalent to the number of steps it takes to find the location in which the new number belongs—O(log n).
Disadvantage is the rapid degeneration of balance.
Hash tables are a data structure that can be implemented as an array of linked lists dependent on a hash function. Hashing means to look at some input and produce some numeric output based on some characteristic of the input. A hash function should be deterministic and uniformly distribute data. Collisions need to be handled.
Disadvantage is increased memory usage.
Tries are a data structure that is a tree of nodes. Each node is an array of pointers to other nodes.
typedef struct _trie {
char university[20];
struct _trie* paths[10];
} trie;
Disadvantage is increased memory usage.
A queue is a first in first out (FIFO) data structure.
Methods:
A stack is a last in first out (LIFO) data structure.
Methods:
A dictionary is a set of associated key/value pairs.
See python (programming language).
See database.
A flat-file database is a database stored as an unstructured file. i.e. csv, tsv.
A relational database stores and provides access to data points that are related to one another represented as tables. Each row in the table is a record with a unique ID called the key. The columns of the table hold attributes of the data.
See sql.
A primary key uniquely identifies each row. A foreign key references the primary key of another table.
See Internet Protocol Suite, IP address, and DNS.
Hypertext Transfer Protocol (HTTP) is an application layer protocol in the internet protocol suite model for distributed, collaborative, hypermedia information systems.
Some supported commands:
GET
- Request information from a serverPOST
- Send information to a serverSee HTML, CSS, and JavaScript.
Building web applications with Flask.
A cookie is a packet of data stored in device memory that allows a website to recognise users when they return to a site often allowing access to different content or services.
Algorithms manage the exploration vs. exploitation trade-off. One such method is epsilon-greedy, where is a parameter controlling the amount of exploration vs. exploitation.
A genetic algorithm is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms.
Artificial neural networks are comprised of a node layers, containing an input layer, one or more hidden layers, and an output layer. Each node, or artificial neuron, connects to another and has an associated weight and threshold. If the output of any individual node is above the specified threshold value, that node is activated, sending data to the next layer of the network. Otherwise, no data is passed along to the next layer of the network.