C (programming language)

C is a general-purpose, imperative, procedural programming language that supports structured programming. It is designed to be compiled, giving low-level access to memory through language constructs that map efficiently to machine instructions.

Also see notes on C Programming: A Modern Approach.

Creating a C program

There are four fundamental tasks in the creation of any C program.

  1. editing - creating and modifying source code (*.c)
  2. compiling - converting source code to machine code, detecting and reporting errors along the way (*.o)
  3. linking - combining object modules and their dependencies into one executable (a.out)
  4. executing - running the compiled program

Human-readable source code needs to be compiled to machine code using a compiler like gcc or clang.

A linker combines the object code produced by the compiler with any additional code needed to yield a complete executable program.

From the command line, run clang <filename>.c to generate an executable file named a.out, then run ./a.out to execute the program. The -o flag names the output file instead:

clang -o hello hello.c

The make utility reads build rules from a makefile in order to compile and link a program, rebuilding only the parts that are out of date.

Program structure

  • preprocessor directives
  • functions & variables
  • statements & expressions
  • comments

#include <stdio.h>

int main(void) {
  // Print a message
  printf("hello, world!\n");
  return 0;
}

The main function

The main function is mandatory. It is called automatically when the program is executed. main returns a status code; the value 0 indicates normal program termination. Since C99, reaching the end of main without a return statement is equivalent to return 0, although compilers targeting older standards may produce a warning message.

Comments

Comments are ignored by the compiler.

/*
This is an example of
a multi-line comment
*/

// This is an example of a single-line comment

Preprocessor

The preprocessor handles special statements, called directives, before the program itself is compiled. It provides inclusion of header files, macro expansion, conditional compilation, and line control.

Preprocessor statements are identified by a pound sign, #, as the first non-space character on a line.

Directive   Description
#define     Defines a preprocessor macro
#include    Inserts the contents of another file, typically a header
#undef      Undefines a preprocessor macro
#ifdef      True if the macro is defined
#ifndef     True if the macro is not defined
#if         Tests whether a compile-time condition is true
#else       The alternative branch for #if
#elif       #else and #if in one directive
#endif      Ends a preprocessor conditional
#error      Reports an error message and stops compilation
#pragma     Issues implementation-specific commands to the compiler
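
As a rough sketch of how several of these directives work together, the fragment below defines two macros and uses conditional compilation to switch logging on or off. The names BUFFER_SIZE, SQUARE, DEBUG, LOG, and squared_buffer_size are made up for illustration.

```c
#include <stdio.h>

// Object-like macro: every later use of BUFFER_SIZE is replaced by 64
#define BUFFER_SIZE 64

// Function-like macro: parenthesize each parameter and the whole body
#define SQUARE(x) ((x) * (x))

// Provide a default only if DEBUG was not already defined,
// for example on the command line with -DDEBUG=1
#ifndef DEBUG
#define DEBUG 0
#endif

// Conditional compilation: only one of these definitions is compiled
#if DEBUG
#define LOG(msg) printf("debug: %s\n", (msg))
#else
#define LOG(msg) ((void)0)
#endif

// Hypothetical helper that uses the macros above
int squared_buffer_size(void) {
  LOG("computing squared buffer size");
  return SQUARE(BUFFER_SIZE);
}
```

The parentheses in SQUARE matter: without them, SQUARE(1 + 2) would expand to 1 + 2 * 1 + 2.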

Primitive data types

Integer types

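Exact sizes are implementation-defined; the sizes below are typical, and the C standard guarantees only minimums.

Type             Typical size     Sign
char             8 bits           -/0/+ or 0/+
unsigned char    8 bits           0/+
signed char      8 bits           -/0/+
int              16 or 32 bits    -/0/+
unsigned int     16 or 32 bits    0/+
short            16 bits          -/0/+
unsigned short   16 bits          0/+
long             32 or 64 bits    -/0/+
unsigned long    32 or 64 bits    0/+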

Floating point types

Type          Typical size                     Precision
float         32 bits                          about 6 significant decimal digits
double        64 bits                          about 15 significant decimal digits
long double   80 bits on x86 (varies widely)   about 18 significant decimal digits at 80 bits

void type

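The void type specifies that no value is available. It is used for:

  • a function return type (the function returns nothing)
  • a function parameter list (the function takes no arguments)
  • a void pointer (a pointer to data of unspecified type)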

Type casting

Use the cast operator to convert one data type to another.

(type_name) expression
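
A common use of the cast operator is forcing floating-point division on integer operands. The sketch below contrasts the two behaviors; the function names average and truncated_average are invented for this example.

```c
// Hypothetical helper: average of `count` values whose total is `sum`
double average(int sum, int count) {
  // Casting one operand to double forces floating-point division
  return (double)sum / count;
}

// Same computation without the cast: the division is done on ints,
// and only the already truncated result is converted to double
double truncated_average(int sum, int count) {
  return sum / count;
}
```

With a sum of 7 and a count of 2, average gives 3.5 while truncated_average gives 3.0.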

String data type

C has no built-in string type. A string is an array of chars that ends with a null character, and it is usually handled through a pointer to its first character. Because every string ends with the null terminator, a pointer to the first character is enough to locate the whole string. Some code defines a helper, typedef char* string, so that string s means the same as char* s.

The null character is a control character with the value zero. Written \0 (all bits zero), it marks the end of a string, so a null-terminated string needs one byte per character plus one more byte for the null character.

// A string is referred to through a pointer to its first character
char* s = "Hello, world!";

// Helper
typedef char* string;

// Helper usage
string t = "Hello, world!";
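
To make the role of the null terminator concrete, here is a minimal sketch that measures a string by walking it until \0 is found, and that shows the extra byte the terminator occupies. Both functions are illustrative; string_length simply mirrors what the standard strlen does.

```c
#include <stddef.h>

// Hypothetical re-implementation of strlen, for illustration only
size_t string_length(const char *s) {
  size_t n = 0;
  while (s[n] != '\0') { // stop at the null terminator
    n++;
  }
  return n;
}

// Storage used by the array: 5 characters plus 1 byte for '\0'
size_t greeting_storage(void) {
  char greeting[] = "Hello";
  return sizeof greeting;
}
```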

Variables

Variables are the names you give to computer memory locations which are used to store values in a program.

A variable declaration makes the variable's type and identifier known to the compiler so that it can accept references to that identifier.

extern int i;

Variable definition allocates storage for the variable. Though you can declare a variable multiple times in your C program, it can be defined only once in a file, a function, or a block of code.

int i;

Variable initialization occurs when a value is assigned to a variable as part of its definition:

int i = 1;

Scope

Local variables can only be accessed within the function or block in which they are declared. Arguments are passed by value in function calls, meaning the function receives a copy of the variable's value and not the variable itself.

Global variables can be accessed by any function in the program.
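
The difference between a global variable and a by-value local copy can be sketched as follows. The names counter, increment_counter, and increment_copy are assumptions made for this example.

```c
// Global variable: visible to every function below this point in the file
int counter = 0;

// Modifies the global directly
void increment_counter(void) {
  counter++;
}

// `n` is a local copy of the argument; changing it does not
// affect the variable the caller passed in
int increment_copy(int n) {
  n++;
  return n;
}
```

If a caller holds x = 5 and calls increment_copy(x), the function returns 6 but x is still 5 afterwards.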

Statements

A statement is a command to be executed when the program runs. C requires that each statement end with a semicolon.

Expressions

Operators

Operators are symbols that perform mathematical or logical operations on their operands.

Arithmetic

  • + add
  • - subtract
  • * multiply
  • / divide
  • % modulus
  • ++ increment
  • -- decrement

Logical

  • && logical AND
  • || logical OR
  • ! logical NOT

Relational

  • == equality
  • != inequality
  • < less than
  • > greater than
  • <= less than or equal to
  • >= greater than or equal to

Bitwise

  • & binary AND
  • | binary OR
  • ^ binary XOR
  • ~ binary one's complement
  • << binary left shift
  • >> binary right shift
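
The bitwise operators are often combined to manipulate individual bits. The helpers below are a small sketch with invented names; pos counts from 0 at the least significant bit and is assumed to be smaller than the width of unsigned.

```c
// Turn bit `pos` on
unsigned set_bit(unsigned value, unsigned pos) {
  return value | (1u << pos);
}

// Turn bit `pos` off
unsigned clear_bit(unsigned value, unsigned pos) {
  return value & ~(1u << pos);
}

// Flip bit `pos`
unsigned toggle_bit(unsigned value, unsigned pos) {
  return value ^ (1u << pos);
}

// Return 1 if bit `pos` is set, otherwise 0
int test_bit(unsigned value, unsigned pos) {
  return (int)((value >> pos) & 1u);
}
```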

Assignment

  • = simple assignment
  • += add AND assignment
  • -= subtract AND assignment
  • *= multiply AND assignment
  • /= divide AND assignment
  • %= modulus AND assignment
  • <<= left shift AND assignment
  • >>= right shift AND assignment
  • &= bitwise AND assignment
  • ^= bitwise exclusive OR assignment
  • |= bitwise inclusive OR assignment

sizeof operator

The sizeof operator is a unary operator that yields the size of its operand in bytes. The operand can be a type or an expression, and the result has type size_t, normally computed at compile time.

printf("Size of int type: %zu\n", sizeof(int));

Operator Precedence

Precedence determines how operators are grouped with their operands in an expression. If two operators have the same precedence, associativity rules decide the grouping. Parentheses override both.

Visit Precedence and order of evaluation at Microsoft Docs for a complete table.
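
A few small cases illustrate precedence, parentheses, and associativity. The function names are placeholders chosen for this sketch.

```c
// Multiplication binds tighter than addition: parsed as 2 + (3 * 4)
int without_parens(void) {
  return 2 + 3 * 4;
}

// Parentheses force the addition to be grouped first
int with_parens(void) {
  return (2 + 3) * 4;
}

// Subtraction is left-associative: parsed as (10 - 4) - 3
int left_assoc(void) {
  return 10 - 4 - 3;
}

// Assignment is right-associative: parsed as a = (b = 7)
int right_assoc(void) {
  int a, b;
  a = b = 7;
  return a + b;
}
```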

Control flow

Control flow statements break up a program's flow of execution by employing decision making, looping, and branching, enabling the program to conditionally execute particular blocks of code.

Conditional branching statements

A conditional branching statement specifies one or more conditions to be evaluated by the program. Different statements are executed depending on whether each condition is met.

The if statement:

if (boolean-expression) {
  // code executed if bool-exp evaluates to true
}

The if/else statement:

if (boolean-expression) {
  // code executed if bool-exp evaluates to true
} else {
  // code executed if bool-exp evaluates to false
}

The switch statement:

int x = 2;
switch (x) {
  case 1:
    // code executed if x == 1
    break;
  case 2:
    // code executed if x == 2
    break;
  default:
    // code executed if no case matches
    break;
}
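
When a case has no break, execution falls through into the next case. The sketch below uses that deliberately so that several values share one branch; days_in_month is a hypothetical helper that ignores leap years and assumes month is between 1 and 12.

```c
// Hypothetical helper: days in a month of a non-leap year
int days_in_month(int month) {
  int days;
  switch (month) {
    case 2:
      days = 28;
      break;
    case 4: // no break: these cases fall through ...
    case 6:
    case 9:
    case 11: // ... and all share the statements below
      days = 30;
      break;
    default: // every other month
      days = 31;
      break;
  }
  return days;
}
```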

The ternary operator:

// condition ? expression 1 : expression 2;

int x = (expr) ? 5 : 6;
// `x` is assigned a value of 5 if
// `expr` evaluates to true, else, 6

Unconditional branching statements

An unconditional branching statement doesn't check any condition before executing the code.

  • The break statement jumps out of the enclosing loop or switch.
  • The continue statement skips the rest of the current iteration of a loop and moves on to the next one.
  • The return statement ends the execution of a function and returns control to the calling function.
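
All three statements can appear in one loop. As a minimal sketch, the hypothetical function below sums the even values of an array and stops at the first negative value.

```c
// Hypothetical helper: sum the even values of `values`,
// stopping at the first negative value
int sum_even_until_negative(const int *values, int count) {
  int sum = 0;
  for (int i = 0; i < count; i++) {
    if (values[i] < 0) {
      break; // leave the loop entirely
    }
    if (values[i] % 2 != 0) {
      continue; // skip odd values and go to the next iteration
    }
    sum += values[i];
  }
  return sum; // hand the result back to the caller
}
```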

Looping statements

A looping statement allows statement(s) to be executed multiple times.

The for loop repeats a specified number of times:

for (init; condition; increment) {
  statement(s);
}

The while loop repeats while expression evaluates to true:

while(condition) {
   statement(s);
}

The do while loop executes its body once and then checks the condition, so the body is guaranteed to run at least once:

do {
   statement(s);
} while( condition );
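
The three loop forms suit different situations. The functions below are an illustrative sketch, one per loop type, with names chosen for this example.

```c
// for: the number of iterations is known up front
unsigned long factorial(unsigned n) {
  unsigned long result = 1;
  for (unsigned i = 2; i <= n; i++) {
    result *= i;
  }
  return result;
}

// while: the condition is checked before each iteration
unsigned gcd(unsigned a, unsigned b) {
  while (b != 0) {
    unsigned r = a % b;
    a = b;
    b = r;
  }
  return a;
}

// do while: the body runs once even when n is 0
int count_digits(unsigned n) {
  int digits = 0;
  do {
    digits++;
    n /= 10;
  } while (n != 0);
  return digits;
}
```

count_digits shows why do while exists: the number 0 still has one digit, so the body has to run before the condition is tested.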

Derived & user-defined data types

Derived data types are built from primitive data types.

  • Functions
  • Arrays
  • Pointers

User-defined data types:

  • Structures
  • Unions
  • Enum
  • Typedef

Functions

A function declaration—also known as a function prototype—tells the compiler about a function name, its return type, and its parameters. Parameters require defined datatypes.

return_type function_name( parameter_list );

A function definition specifies the body of the function in addition to the function name, return type, and parameters. Parameters require defined datatypes.

return_type function_name( parameter_list ) {
   body of the function
}

Functions are called by name with any required arguments. Arguments are written without data types.

function_name( argument_list );

The return statement provides the means of exiting from a function and returning data when necessary.
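
Declaration, definition, and call can be seen together in the following sketch. The functions max_of and max_of_three are invented for illustration.

```c
// Declaration (prototype): name, return type, and parameter types
int max_of(int a, int b);

// A function can be called before its definition once it has been declared
int max_of_three(int a, int b, int c) {
  return max_of(max_of(a, b), c);
}

// Definition: the same signature plus the body
int max_of(int a, int b) {
  return (a > b) ? a : b;
}
```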

Arrays

An array is a data container that stores a sequential collection of elements of the same type. In most expressions, the name of an array is converted to a pointer to the first item in the array.

Declare an array by specifying the type and number of elements required.

data_type array_name[array_size];

Example declaration:

int scores[4];

There are two ways to give an array its values. When an initializer list is provided, any elements it does not cover are zero-initialized; a local array declared without an initializer holds indeterminate values until they are assigned.

  1. Static array initialization - Initializes the elements of the array during its declaration.

// The number of initializers cannot exceed the
// number declared in square brackets
int scores[4] = { 90, 96, 89, 94 };

// Omitting the size creates an array big enough
// to hold only the number of initialized elements
int scores[] = { 90, 96, 89, 94 };

  2. Dynamic array initialization - The declared array is given its values some time later during execution of the program.

int scores[4];
scores[0] = 90;
scores[1] = 96;
scores[2] = 89;
scores[3] = 94;

Access array elements by index number.

scores[0]; // 90
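
When an array is passed to a function, only a pointer to its first element arrives, so the length has to travel separately. The sketch below assumes a hypothetical ARRAY_LEN macro and average_score function.

```c
#include <stddef.h>

// Number of elements in an array (valid only on a true array, not a pointer)
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

// The array is received as a pointer to its first element,
// so the element count is passed as a second argument
double average_score(const int *scores, size_t count) {
  if (count == 0) {
    return 0.0;
  }
  long total = 0;
  for (size_t i = 0; i < count; i++) {
    total += scores[i];
  }
  return (double)total / (double)count;
}
```

For the scores 90, 96, 89, and 94 used above, average_score(scores, ARRAY_LEN(scores)) evaluates to 92.25.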

Pointers

In programming languages, indirection is the ability to reference something using a name, reference, or container, instead of the value itself. The most common form of indirection is the act of manipulating the value through its memory address.

A pointer is a variable whose value is the address (location in memory) of another variable. The type describes the data located at that memory address.

  • & - "address of" operator
  • * - indirection operator
    • before a variable declaration is a "pointer to" operator
    • before a variable expression is a "go to address" operator

// Variable declaration
int var = 10;

// Pointer declaration
int *pvar;

// Store the address of `var` in `pvar`
pvar = &var;

// Use the address of operator `&` to access
// the address of a variable (`%p` expects a `void *`)
printf("Address of `var`: %p\n", (void *)&var);

// A pointer's value is an address
printf("Address stored in `pvar`: %p\n", (void *)pvar);

// Use the indirection operator `*` to access
// the value of the variable pointed to
printf("Value of `*pvar`: %d\n", *pvar);

It is considered best practice to set pointers that do not yet point to anything to NULL, which is the equivalent of zero for a pointer (defined in stddef.h, among other standard headers).

Dereferencing

When the indirection operator is applied to a pointer variable, it is known as dereferencing the pointer. Dereferencing a pointer yields the variable that the pointer points to.

  • It can be used to access or manipulate the data stored at the memory location which the pointer points to.
  • Any operation applied to the dereferenced pointer will directly affect the value of the variable that it points to.
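
Because arguments are passed by value, a function that needs to change the caller's variables receives pointers and dereferences them. A minimal sketch, using the made-up name swap_ints:

```c
// Hypothetical helper: exchange the values of the two pointed-to ints
void swap_ints(int *a, int *b) {
  int tmp = *a; // read the value `a` points to
  *a = *b;      // write through `a`
  *b = tmp;     // write through `b`
}
```

A caller passes addresses, as in swap_ints(&x, &y), and afterwards x and y hold each other's original values.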

Pointer arithmetic

int n[] = {0, 1, 2, 3};
int *p = n;

// `p` points to the first element, so `*p` is n[0]
*p; // 0

// Add to the address and apply the indirection operator to access subsequent values
*(p + 1); // 1

// Incrementing the pointer makes it point to the next element
p++;
*p; // 1
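
Pointer arithmetic works in units of elements, not bytes. The following sketch walks an array with a pointer and measures the distance between two elements; sum_range and index_of are hypothetical names.

```c
#include <stddef.h>

// Sum an array by walking a pointer from the first element
// to one past the last element
int sum_range(const int *first, size_t count) {
  const int *end = first + count; // one past the last element
  int sum = 0;
  for (const int *p = first; p != end; p++) {
    sum += *p;
  }
  return sum;
}

// Subtracting two pointers into the same array gives
// the distance between them in elements
ptrdiff_t index_of(const int *first, const int *element) {
  return element - first;
}
```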

Dynamic memory allocation

malloc

The C library function malloc allocates the requested memory and returns a pointer to it, or NULL if the allocation fails.

The following example dynamically allocates enough memory for 25 integers, checks that it was allocated successfully, and then releases the memory.

// Allocate memory (malloc, free, and exit require stdlib.h)
int *p = NULL;
p = (int *)malloc(25 * sizeof(int));

// Check for successful memory allocation
if (!p) {
  exit(1);
}

// Release allocated memory
free(p);
p = NULL;
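
The same allocate, check, and release sequence is often split between a function that allocates and a caller that frees. The sketch below assumes a hypothetical make_squares helper; the cast on malloc is omitted here because C converts void * implicitly.

```c
#include <stdlib.h>

// Hypothetical helper: returns a heap array holding 0*0, 1*1, 2*2, ...
// or NULL if the allocation fails. The caller owns the memory
// and must release it with free().
int *make_squares(size_t count) {
  int *squares = malloc(count * sizeof *squares);
  if (squares == NULL) {
    return NULL;
  }
  for (size_t i = 0; i < count; i++) {
    squares[i] = (int)(i * i);
  }
  return squares;
}
```

A caller would check the returned pointer against NULL before using it and call free on it when done.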

calloc

The C library function calloc allocates the requested memory and returns a pointer to it. The difference between malloc and calloc is that calloc sets the allocated memory to zero.

The following memory allocation is equivalent to the malloc call in the previous example, with the additional outcome of zero-initialized values.

int *p = NULL;
p = (int *)calloc(25, sizeof(int));

realloc

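The C library function realloc attempts to resize the memory block previously allocated with a call to malloc or calloc. realloc preserves the contents of the original memory block up to the smaller of the old and new sizes. If it fails, it returns NULL and leaves the original block allocated.

To reallocate the memory pointed to by the pointer p in the previous examples, assign the result to a temporary pointer first so that p is not lost on failure:

int *tmp = (int *)realloc(p, 50 * sizeof(int));
if (tmp) p = tmp;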
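
Since realloc returns NULL on failure while the original block stays allocated, a small wrapper can keep the caller's pointer valid either way. This is only a sketch: resize_ints is a hypothetical name, and new_count is assumed to be greater than zero.

```c
#include <stdlib.h>

// Hypothetical helper: resize `*array` to hold `new_count` ints.
// Returns 1 on success. On failure returns 0 and leaves `*array`
// pointing at the original, still valid block.
int resize_ints(int **array, size_t new_count) {
  int *tmp = realloc(*array, new_count * sizeof *tmp);
  if (tmp == NULL) {
    return 0;
  }
  *array = tmp; // the block may have moved, so update the caller's pointer
  return 1;
}
```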

Structures

A structure is a user-defined data type that can store data items of differing types.

Define a structure:

struct [structure tag] {
   member definition;
   ...
} [structure variables];

Structure definition example:

struct Book {
  int id;
  char title[50];
  char author[50];
};

Structure variables

When a struct type is declared, no storage or memory is allocated. To allocate memory of a given structure type, we need to create variables.

struct Book {
  // code
} book1, book2;

or

struct Book {
  // code
};

struct Book book1, book2;

Accessing members of a structure

Refer to a member of a structure with the "dot" (.) operator, used for direct member selection via the object name.

object.member

You can also use the indirect member selection operator (->), which has the same precedence as the dot (.) operator. It accesses members through a pointer to the structure.

pointer->member
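
Both operators appear in the sketch below, which reuses the Book structure from earlier. The set_title function is an assumption made for this example; it takes a pointer so that the caller's structure is modified rather than a copy.

```c
#include <stdio.h>

struct Book {
  int id;
  char title[50];
  char author[50];
};

// Hypothetical helper: copy `title` into the book, truncating if needed.
// book->title is shorthand for (*book).title
void set_title(struct Book *book, const char *title) {
  snprintf(book->title, sizeof book->title, "%s", title);
}
```

Given a variable declared as struct Book b, direct access looks like b.id = 1, while the pointer form is used inside set_title(&b, "Some title").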

Initializing a structure

struct Date {
  int day;
  int month;
  int year;
};

// 1. In order
struct Date today = { 14, 10, 2022 };

// 2. With member names
struct Date today = { .day = 14, .month = 10, .year = 2022 };

// 3. Compound literal
today = (struct Date) { 14, 10, 2022 };

Enum

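Enumeration is a user-defined data type used to assign names to integral constants. By default the first name has the value 0 and each following name is one greater than the previous one.

#include <stdio.h>

enum primary_color { red, yellow, blue };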

int main(void) {
  enum primary_color my_color = blue;
  printf("%d\n", my_color);
}

// Output:
// 2
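
Enumerators work well as switch case labels, and they can also be given explicit values. The sketch below repeats the primary_color definition so that it stands alone; color_name and the http_status enumeration are hypothetical additions.

```c
enum primary_color { red, yellow, blue };

// Explicit values can be assigned instead of the default 0, 1, 2, ...
enum http_status { HTTP_OK = 200, HTTP_NOT_FOUND = 404 };

// Hypothetical helper: map each enumerator to a printable name
const char *color_name(enum primary_color color) {
  switch (color) {
    case red:    return "red";
    case yellow: return "yellow";
    case blue:   return "blue";
  }
  return "unknown"; // value outside the enumeration
}
```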

Typedef

Typedef assigns a new name to an existing or user-defined data type.

// Existing
typedef char* string;

// User defined
typedef struct {
  string name;
  string number;
} Person;

Command-line arguments

int main(int argc, char* argv[])

  • int argc - argument count
  • char* argv[] or string argv[] (with helper) - argument vector

argv[0] is conventionally the program's name, so the first argument typed by the user is argv[1].
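
As a sketch of how the argument vector is typically walked, the hypothetical helper below adds up the arguments that follow the program name. It uses the standard strtol function, which yields 0 for text that does not start with a number.

```c
#include <stdlib.h>

// Hypothetical helper: add up the arguments after the program name,
// reading each one as a base-10 integer
long sum_arguments(int argc, char *argv[]) {
  long total = 0;
  for (int i = 1; i < argc; i++) { // start at 1 to skip argv[0]
    total += strtol(argv[i], NULL, 10);
  }
  return total;
}
```

A main function declared as int main(int argc, char* argv[]) could pass its own argc and argv straight to sum_arguments and print the result.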

C Standard library

Resources