
Archive gcc bootstrap

This commit was never properly finalized, might not even work
master
mid-kid 2 years ago
parent
commit
2485b5c6f3
  1. gcc/README.md (61 lines changed)
  2. gcc/build.sh (12 lines changed)
  3. gcc/build_binaries.sh (4 lines changed)
  4. gcc/build_bootstrap.sh (27 lines changed)
  5. gcc/build_cross.sh (9 lines changed)
  6. gcc/notes/gentoo/gentoo_notes.txt (2 lines changed)

61
gcc/README.md

@@ -0,0 +1,61 @@
Deprecation
===========
Just use [live-bootstrap](https://github.com/fosslinux/live-bootstrap), they got a lot more figured out.
This is just historical interest for me now.
GCC Bootstrap
=============
This is a collection of notes and utilities related to bootstrapping the GNU Compiler Collection, as well as necessary GNU utilities, from as little as possible, with the goal of bootstrapping any GNU/Linux distribution from these, such as Linux From Scratch or Gentoo.
Background
----------
C compilers, being as old, complex and utterly foundational as they are, don't have nearly as clear or well-documented a bootstrap path as newer languages. How exactly they came to be has mostly been lost to history, probably because of how many small iterative steps were taken before a consensus was reached on what the language would be.
Nowadays, you'd think you can bootstrap C from simply an assembler, and while you _can_, most assemblers are written in C for the sake of architecture independence, and you also need an environment to run this assembler in, to create and manage files, which is often also written in C. To fully bootstrap, you would have to write a kernel, assembler and compiler from scratch, all of which (especially the kernel) would be tied to a specific set of hardware, which few other people would have. That's rather inconvenient.
However, there's a more convenient way to go about this, that satisfies me, mostly. That is, by simply reducing the size of the binaries required to build a "full" C compiler as much as possible. This makes it possible to run these on existing kernels and systems, while making the binaries as inspectable, simple and hand-writeable as possible, so that one day there _could_ be a computer that bootstraps itself from hand-written machine code.
Enter [GNU Mes](https://www.gnu.org/software/mes/), a very small Scheme implementation (mes) written in C, a C library (mes-libc), and a C compiler (mescc) written in said Scheme. The mes binary comes in at around 108kB, and is fully self-hosting, requiring only a Linux(-compatible) kernel and an x86 processor to run.
This compiler has already been used to successfully build a full GNU/Linux distribution, as described in [a Guix recipe](https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages/commencement.scm). However, that source file is of little use to anyone not using Guix: the bootstrap process is complex, comes with no explanation of what is being done, does a lot of things specific to how Guix works, and generally isn't very readable to anyone unfamiliar with functional package managers and/or Scheme.
I seek to "fix" this, by documenting the exact manual steps involved in this bootstrap, providing a regular shell script that can be used by _anyone_, and hopefully help make "Linux From Scratch" just that tiny bit more "From Scratch" than it already is.
Goals
-----
For sanity, this bootstrap's only goal is to provide just enough to build a modern GCC without any intermediate compiler/tool version hops.
A note about differences from Guix
----------------------------------
While I've tried to keep most of the bootstrap process intact compared to Guix, a lot of steps have been changed to simplify the instructions, remove things that are unnecessary since this isn't Guix, or reduce the functionality required from the bootstrap utilities. It's hard to describe the exact changes, because there are a lot of them. The main things that remain unchanged are (most of) the software versions and the general bootstrap path.
Every step in the `build_bootstrap.sh` script has a comment with the name of the equivalent Guix package definition. The approximate revision these scripts are based on is [this](https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages/commencement.scm?id=b85863f7ce99d05205e57358b36ff50656cca08b).
Future goals
------------
While I consider this bootstrap complete as is, this isn't the end of the road for smaller bootstrap binaries. As part of the [Bootstrappable](http://bootstrappable.org/) project, people are working on [bootstrapping Mes with M2-Planet](https://github.com/oriansj/mes-m2), which in turn can be built with [a sub-kB hex assembler](https://github.com/oriansj/mescc-tools-seed), but it's not yet clear when or how this will be finished, as it's all still work-in-progress.
Additionally, while I took care to simplify some things when reimplementing the Guix bootstrap process, including what is required of some tools, there's still room for improvement. Guix itself might also rewrite or simplify their process in the future, which might be passed down to here. I don't know.
In any case, anything that can be changed to reduce the number of steps in the bootstrap, reduce the number of patches/hacks necessary or reduce the binary footprint is welcome.
For more tangible goals, here's a small todo:
- Figure out if the coreutils precursors, fileutils, textutils and sh-utils, can be built any earlier to further reduce reliance on busybox.
- Revisit glibc-2.16.0, as the build breaks(?) when rebuilt with gcc-4.9.4, and the patches are mildly ugly.
Note on busybox
---------------
While busybox is used to provide the initial toolset, because of its small size as a statically-linked binary, it isn't rebuilt during the bootstrap. Rebuilding it would greatly simplify the instructions by providing a ton of auxiliary tools at once, but I haven't been able to find a single version that builds with gcc-2.95.2/glibc-2.2.5, so the GNU tools are used during the build instead.

12
gcc/build.sh

@@ -5,12 +5,14 @@ set -e
 #qemu=qemu-i386
 qemu=
+NPROC="${NPROC:-$(nproc)}"
 rm -rf system
 mkdir -p builds
 # Stage 1: Build matching mes and busybox binaries
-./build_binaries.sh
+NPROC="$NPROC" ./build_binaries.sh
-mv system/binaries.tar.gz builds/ # Will be regenerated later, to match.
+mv system/binaries.tar.gz builds/
 # Stage 2: Bootstrap system from these
 mkdir -p system/sources/
@@ -19,13 +21,13 @@ cp build_bootstrap.sh system/sources/
 mkdir system/dev system/tmp
 mknod system/dev/null c 1 3
 mknod system/dev/tty c 5 0
-$qemu system/bin/busybox chroot system /bin/busybox env -i NPROC="$(nproc)" /bin/busybox sh /sources/build_bootstrap.sh
+$qemu system/bin/busybox chroot system /bin/busybox env -i NPROC="$NPROC" /bin/busybox sh /sources/build_bootstrap.sh
 mv system/bootstrap.tar.gz builds/
 # Stage 2.5 (optional): Rebuild bootstrap binaries to have a "clean" archive
 if [ "$1" = double_bootstrap ]; then
 cp build_binaries.sh binaries.sha1 busybox-config system/
-$qemu system/bootstrap/bin/chroot system /bootstrap/bin/env -i NPROC="$(nproc)" PATH=/bootstrap/bin /bootstrap/bin/sh /build_binaries.sh
+$qemu system/bootstrap/bin/chroot system /bootstrap/bin/env -i NPROC="$NPROC" PATH=/bootstrap/bin /bootstrap/bin/sh /build_binaries.sh
 mv system/system/binaries.tar.gz builds/
 rm system/build_binaries.sh system/binaries.sha1 system/busybox-config
 rm -rf system/system/
@@ -33,5 +35,5 @@ fi
 # Stage 3: Cross-compile system for x86_64
 cp build_cross.sh system/sources
-$qemu system/bootstrap/bin/chroot system /bootstrap/bin/env -i NPROC="$(nproc)" /bootstrap/bin/sh /sources/build_cross.sh
+$qemu system/bootstrap/bin/chroot system /bootstrap/bin/env -i NPROC="$NPROC" /bootstrap/bin/sh /sources/build_cross.sh
 mv system/system/cross.tar.gz builds/
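
The `NPROC="${NPROC:-$(nproc)}"` line in this file relies on the shell's default-value expansion: a value already present in the environment is kept, and `nproc` is only consulted when it is unset or empty. Below is a minimal sketch of that behaviour in isolation, assuming `nproc` from coreutils is available; the helper name `resolve_jobs` is hypothetical and is not part of `build.sh`.

```shell
#!/bin/sh
# Hypothetical helper (not part of build.sh): print the requested job count,
# or fall back to the number of available CPUs when it is unset or empty.
resolve_jobs() {
    requested="$1"
    echo "${requested:-$(nproc)}"
}

jobs_default="$(resolve_jobs "")"   # nothing requested: ask nproc
jobs_pinned="$(resolve_jobs 2)"     # an explicit value wins over nproc

echo "default=$jobs_default pinned=$jobs_pinned"
```

If the script behaves as the diff suggests, an invocation such as `NPROC=2 ./build.sh` would carry the same job count into every stage, since each `env -i` call now passes `NPROC="$NPROC"` instead of re-running `nproc`.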

4
gcc/build_binaries.sh

@@ -37,7 +37,7 @@ tar xf "$dir_sources/mes-$version_mes.tar.gz"
 # First, we build a native mes, "mes-gcc".
 # This allows us to cross-build everything on systems that mes doesn't support.
-gcc -O2 -std=gnu99 -w -lrt -o mes-gcc -DMES_VERSION='""' -DSYSTEM_LIBC=1 -Iinclude \
+gcc -w -O2 -std=gnu99 -fcommon -lrt -o mes-gcc -DMES_VERSION='""' -DSYSTEM_LIBC=1 -Iinclude \
 lib/mes/eputs.c \
 lib/mes/fdgetc.c \
 lib/mes/fdputc.c \
@@ -117,7 +117,7 @@ tar xf "$dir_sources/busybox-$version_busybox.tar.bz2"
 # GCC-4.6.4's results aren't reproducible across machines for some reason,
 # so this bootstrap uses GCC-4.9.4. This increases the build time a little.
 ( cd binutils
-CFLAGS='-O2 -w' ./configure \
+CFLAGS='-O2 -w -fcommon' ./configure \
 --target=i686-bootstrap-linux-gnu \
 --prefix="$prefix" \
 --with-sysroot="$prefix" \

27
gcc/build_bootstrap.sh

@@ -411,6 +411,9 @@ bzcat binutils-2.14.tar.bz2 | tar x
 bfd/configure > /tmp/sed; mv /tmp/sed bfd/configure
 chmod +x configure bfd/configure
+# Force deterministic AR output
+patch -p1 -i ../binutils-2.14-force-deterministic.patch
 CC='tcc -D__GLIBC_MINOR__=6' AR='tcc -ar' ./configure \
 --host=i686-pc-linux-gnu \
 --prefix=/gcc2 \
@@ -508,6 +511,9 @@ bzcat binutils-2.14.tar.bz2 | tar x
 bfd/configure > /tmp/sed; mv /tmp/sed bfd/configure
 chmod +x configure bfd/configure
+# Force deterministic AR output
+patch -p1 -i ../binutils-2.14-force-deterministic.patch
 ./configure \
 --host=i686-pc-linux-gnu \
 --prefix=/gcc2 \
@@ -710,6 +716,9 @@ rm -rf /bootstrap
 rm -rf binutils-2.20.1
 tar jxf binutils-2.20.1a.tar.bz2
 ( cd binutils-2.20.1
+# Force deterministic AR output
+patch -p1 -i ../binutils-2.20.1-force-deterministic.patch
 ./configure \
 --build=i686-pc-linux-gnu \
 --prefix=/bootstrap \
@@ -732,10 +741,16 @@ tar jxf glibc-2.16.0.tar.bz2
 ( cd glibc-2.16.0
 patch -p1 -i ../glibc-boot-2.16.0.patch
+# Fix hardcode of /bin/pwd
+sed -i -e 's@/bin/pwd@pwd@g' configure
+# Make build deterministic
+sed -i -e 's/__DATE__//g' -e 's/__TIME__//g' nscd/nscd_stat.c
 # This can't be rebuilt with the final gcc and glibc, for some reason
 # Possibly the configure flags aren't suitable?
 mkdir build && cd build
-CC='/gcc46/bin/gcc -DBOOTSTRAP_GLIBC=1 -L/gcc2/lib' ../configure \
+CC='/gcc46/bin/gcc -L/gcc2/lib -DBOOTSTRAP_GLIBC=1' ../configure \
 --build=i686-pc-linux-gnu \
 --prefix=/bootstrap \
 --with-headers=/bootstrap/include \
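
The "Make build deterministic" step in the glibc hunk removes the `__DATE__` and `__TIME__` macros, which otherwise embed the build timestamp in the compiled object. Below is a small sketch of the same `sed` expression applied to a throwaway file. The file name and its contents are made up for illustration and are not taken from `nscd/nscd_stat.c`; `sed -i` is assumed to be supported (GNU sed or busybox).

```shell
#!/bin/sh
# Illustrative only: stat.c and its contents are hypothetical stand-ins.
workdir="$(mktemp -d)"
cat > "$workdir/stat.c" <<'EOF'
static const char compiled[] = "built " __DATE__ " " __TIME__;
EOF

# Same expressions as in the diff: drop both timestamp macros in place.
sed -i -e 's/__DATE__//g' -e 's/__TIME__//g' "$workdir/stat.c"

stripped="$(cat "$workdir/stat.c")"
echo "$stripped"
rm -rf "$workdir"
```

Because adjacent string literals concatenate in C, a line of this shape remains valid after the macros are removed. Whether that holds for every use in the real source file is not something this diff shows.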
@@ -769,6 +784,13 @@ tar jxf gcc-4.9.4.tar.bz2
 tar zxf ../mpc-1.0.3.tar.gz
 mv mpc-1.0.3 mpc
+# Build deterministic archives
+#sed -i -e 's/$AR cru/$AR crD/' mpc/configure
+#sed -i -e '/^ARFLAGS =/s/cru/crD/' \
+# zlib/Makefile.in \
+# libcpp/Makefile.in \
+# libdecnumber/Makefile.in
 # The previous setup to set the library/include path with --with-sysroot
 # doesn't work when you throw dynamic linking into the mix and you're not
 # purely cross-compiling (we want to run resulting binaries as-is).
@@ -829,6 +851,9 @@ export PATH=/bootstrap/bin:/bin
 rm -rf findutils-4.6.0
 tar zxf findutils-4.6.0.tar.gz
 ( cd findutils-4.6.0
+# Build deterministic archives
+#sed -i -e 's/$AR cru/$AR crD/' configure
 ./configure \
 --build=i686-pc-linux-gnu \
 --disable-nls
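
The "deterministic archives" comments in this file refer to `ar` recording timestamps and ownership for each member, which makes otherwise identical builds differ byte for byte. The contents of the `*-force-deterministic.patch` files are not part of this diff, so the sketch below only shows the `D` modifier that the commented-out `crD` lines mention. It assumes a reasonably recent GNU `ar` is on `PATH` and skips itself otherwise; the file names are arbitrary.

```shell
#!/bin/sh
# Sketch: archive the same member twice with different mtimes and compare.
# Assumes GNU ar with the D (deterministic) modifier; skipped if ar is missing.
if command -v ar >/dev/null 2>&1; then
    workdir="$(mktemp -d)"
    printf 'hello\n' > "$workdir/member.txt"

    ar crD "$workdir/first.a" "$workdir/member.txt"
    touch -t 200001010000 "$workdir/member.txt"   # change only the mtime
    ar crD "$workdir/second.a" "$workdir/member.txt"

    # With D, timestamps and owner IDs are stored as zero, so the two
    # archives are expected to be byte-identical.
    if cmp -s "$workdir/first.a" "$workdir/second.a"; then
        archives_match=yes
    else
        archives_match=no
    fi
    rm -rf "$workdir"
else
    archives_match=skipped
fi
echo "archives_match=$archives_match"
```

Replacing `crD` with `cru` in the sketch would normally make the comparison fail, unless the local binutils was configured to default to deterministic archives.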

9
gcc/build_cross.sh

@@ -10,7 +10,7 @@ set -e
 # To homogenize build instructions across both multilib and non-multilib
 # installs, and because some applications require heavy patches to install
 # in alternate libdirs (cough cough python), the lib directory will contain
-# native libraries, while lib32 whill contain 32-bit libraries.
+# native libraries, while lib32 will contain 32-bit libraries.
 # GCC will be patched slightly, and configured to achieve this, as by default
 # it uses lib64 and lib.
@@ -47,6 +47,9 @@ rm -rf linux-4.14
 rm -rf binutils-2.20.1
 tar jxf binutils-2.20.1a.tar.bz2
 ( cd binutils-2.20.1
+# Force deterministic AR output
+patch -p1 -i ../binutils-2.20.1-force-deterministic.patch
 ./configure \
 --build=i686-pc-linux-gnu \
 --target=x86_64-pc-linux-gnu \
@@ -144,8 +147,8 @@ tar jxf glibc-2.16.0.tar.bz2
 # Do the whole gcc/glibc song and dance...
 # DESTDIR set to a different dir since glibc makefile breaks otherwise...
-make -C glibc-2.16.0/build DESTDIR=/system csu/subdir_install
+make -C glibc-2.16.0/build DESTDIR=/system csu/subdir_install # TODO: Try install-lib?
-make -C glibc-2.16.0/build32 DESTDIR=/system csu/subdir_install
+make -C glibc-2.16.0/build32 DESTDIR=/system csu/subdir_install # TODO: Try install-lib?
 make -C glibc-2.16.0/build DESTDIR=/system install-bootstrap-headers=yes install-headers
 touch /system/bootstrap/include/gnu/stubs.h
 x86_64-pc-linux-gnu-gcc -nostdlib -nostartfiles -shared -x c /dev/null -o /system/bootstrap/lib/libc.so

2
gcc/notes/gentoo/gentoo_notes.txt

@@ -165,8 +165,6 @@ emerge -be @system
 To install everything into a clean root:
 USE=build emerge --root /final sys-apps/baselayout
 emerge --root /final -K -j$(nproc) @system
-#emerge --root /final --sysroot /final -K --with-bdeps=y --root-deps -j$(nproc) @system
-TODO: How to forcefully install _everything_
 Now you're essentially done. You can move /final (as well as /var/db/repos,
 /var/cache/distfiles and /var/cache/binpkgs) to a proper disk and start using
