“char *s” vs “char s[]”

Program 1: string_001.c

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

int main(void)
{
    char *str = "string";
    str[0] = 'z';  // same as *str = 'z'; writes to a string literal (undefined behavior)
    printf("%s\n", str);
    return 0;
}

Program 2: string_002.c

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

int main(void)
{
    char str[] = "string";  // array initialized with a copy of the literal
    str[0] = 'z';           // fine: the array is writable
    printf("%s\n", str);
    return 0;
}

Explanation –

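In Program 1, the string literal "string" is stored in read-only memory (the .rodata section on a typical Linux build), and str points to its first character. Modifying a string literal is undefined behavior in C. Here the write lands on a read-only page, so the program crashes with a segmentation fault. Keeping literals in read-only memory protects them from accidental modification.
In Program 2, str is an array on the stack, initialized with a copy of "string". The array is ordinary writable memory, so changing a character is permitted.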

Let’s take a deep dive through the assembly:

You can generate ASM files –
gcc -S -fverbose-asm string_001.c
gcc -S -fverbose-asm string_002.c


Output:
string_001.s
string_002.s

Now diff the two files. The relevant part of string_001.s starts with –

	.file	"string_001.c"
	.text
	.section	.rodata
.LC0:
	.string	"string"
	.text
	.globl	main
	.type	main, @function

.rodata – the read-only data section.
.LC0 – a local label. Here it marks the string constant “string”.

Now look at the difference in the assembly code –

Program 1:
leaq – Load Effective Address (quadword)
leaq S, D (D <- &S)
Loads the address of S into D, not the contents.
leaq .LC0(%rip), %rax # %rip-relative addressing; loads a 64-bit address into rax
Here the address of the read-only data “string” is copied into rax. The compiler then stores it in the local variable str, which lives at -8(%rbp).

Now –

# string_001.c:8: str[0] = 'z';
	movq	-8(%rbp), %rax	# str, tmp83
	movb	$122, (%rax)	#, *str_1

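$122 is the ASCII code for lowercase z. The first instruction loads the pointer str into rax. The second writes the byte 122 to the address held in rax. That address lies in the read-only .rodata section, so the write causes the segmentation fault.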
