the most important question i have: can i program a whole operating system based on UNIX using ONLY python???

yes, not a unix os but rather unix-like, and i want to program all of it on python, is that possible?? even the kernel, i want it all python. i know most kernels use c++ or c* but maybe python has a library to turn c* into python?? i'm still sort of a beginner but thanks and i would appreciate the answers

23 comments
  • As it happens, this is strikingly similar to an interview question I sometimes ask: what parts of a multitasking OS cannot be written wholly in C. As one might expect, the question is intentionally open-ended so as to query a candidate's understanding of the capabilities and limitations of the C language. Your question asks about Python, but I posit that some OS requirement which a low-level language like C cannot accomplish would be equally intractable for Python.

    Cutting straight to the chase, C is insufficient for initializing the stack pointer. Sure, C itself might not technically require a working stack, but a multitasking operating system written in C must have a stack by the time it starts running user code. So most will do that initialization much earlier, so that the OS's startup functions can utilize the stack.

    This is normally done by the bootloader code, which is typically written in assembly and runs when the CPU is taken out of reset, and then will jump into the OS's C code. The C functions will allocate local variables on the stack, and everything will work just fine, even rewriting the stack pointer using intrinsics to cause a context switch (although this code is often -- but not always -- written in assembly too).

    The crux of the issue is that the initial value of the stack pointer cannot be set using C code. Some hardware like the Cortex M0 family will initialize the stack pointer register by copying the value from 0x00 in program memory, but that doesn't change the fact that C cannot set the stack pointer on its own, because invoking a C function may require a working stack in the first place.

    In Python, I think it would be much the same: how could Python itself initialize the stack pointer necessary to start running Python code? You would need a hardware mechanism like with the Cortex M0 to overcome this same problem.

    The reason the Cortex M0 added that feature is precisely to enable developers to never be forced to write assembly for that architecture. They can if they want to, but the architecture was designed to be developed with C exclusively, including interrupt handlers.

    If you have hardware that natively executes Python bytecode, then your OS could work. But for x86 platforms or most other targets, I don't think an all-Python, no-assembly OS is possible.
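
To make the Cortex-M point above a little more concrete, here is a minimal sketch in Python (run on a desktop, not on the chip) that reads the two words the hardware itself consumes at reset: the initial stack pointer at offset 0x00 and the reset handler address at offset 0x04. The function name and the fake image contents are made up for illustration; only the word layout comes from the architecture.

```python
import struct

def read_vector_table_head(image: bytes) -> tuple:
    """Return (initial_sp, reset_handler) from the first 8 bytes of an image."""
    if len(image) < 8:
        raise ValueError("image too short to hold a vector table")
    # Two little-endian 32-bit words: word 0 is the initial stack pointer,
    # word 1 is the address of the reset handler.
    initial_sp, reset_handler = struct.unpack_from("<II", image, 0)
    return initial_sp, reset_handler

# Hypothetical image: stack top at 0x20004000, reset handler at 0xC1
# (the low bit is set because Cortex-M code runs in Thumb state).
fake_image = struct.pack("<II", 0x20004000, 0x000000C1) + bytes(56)
sp, reset = read_vector_table_head(fake_image)
print(hex(sp), hex(reset))
```

The point is that nothing in the firmware has to execute an instruction to set the stack pointer: the value is just data at a fixed location, which is why startup code on that family can be plain C.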

  • The essence of your answers is "yes, but...". And the "but" is mostly about how slow Python is in contexts that need to be astonishingly fast.

    It depends how complex the hardware is and how much time we're willing to waste.

    Technically, when I deploy a Python program to a BBC Microbit, that's (more or less) what is happening. Pure Python code is making every decision, and is interacting directly with all available hardware.

    We could still argue semantics - virtually no (modern) computer exists that isn't running at least one tiny binary compatibility driver written in C.

    I believe the compiled C binary on a BBC Microbit to bootstrap a pure Python OS is incredibly small, but my best guess is that it's still present. The C library for Microbit needed to exist for other languages to use, and Python likes calling C binaries. So I don't imagine anyone has recreated it in pure Python for fun (and slower results).

    (Edit: As others have pointed out, I'm talking about MicroPython, which is itself written in C. The Microbit is so simple it might not use MicroPython, but I can't imagine the BBC Microbit team bothered to reinvent the wheel for this.)

    Of course, if you don't mind that the lowest level code has got to be binary, and very few people are crazy enough to create that code with Python, then...

    That raises another interesting question: just how much of an OS can we get away with writing in Python?

    And that question is answered both by RedHat Linux and Debian Linux - and the answer is that both are built with an awful lot of Python.

    In contrast, Android is mostly Java with lots of C and a C Linux kernel. Windows is mostly C# and lots of C. iOS is mostly Objective-C and lots of C.

    You can have an OS built with almost any language you want, as long as you also want parts of it built in C. (Edit: This is meant to amuse you, not be guidance for what is possible. Today, we love our C code. C didn't always exist, and might someday no longer be our favorite hardware driving language.)

    An interesting current development is discussion around rebuilding parts of the Linux Kernel with Rust, which can run just as fast as C. This would effectively cause RedHat, Debian and Android to replace some of their C code with Rust. To date, there's been a lot of interest and discussion and not a lot of (any?) actual funding or work completed.

  • I would not recommend this as an exercise for a beginner, but RPython is a subset of Python with a C backend; it is used as the basis of PyPy (an implementation of Python), so it may be possible to use it to implement the low-level parts which then can be used to bootstrap a full Python virtual machine.

    • In short: If you'd like to learn more, come visit #pypy on Libera IRC. It's an interesting discussion topic, particularly if we want standard-library imports like math, sys, or json to work.

      RPython is not capable of translating to bare metal today; it depends on libc and libffi for many features even when not producing JIT compilers. It's also intended to operate on a layer of syscalls: rather than directly instructing hardware, it wants to make fairly plain calls, perhaps via FFI, passing ordinary low-level values. So, any OS developer would first have to figure out how to get RPython to emit code that doesn't require runtime support, and also write out the low-level architecture-specific hardware-access routines.

      That said, RPython is designed to translate interpreters, and fundamentally it thinks an interpreter is any function with a while-loop, so a typical OS would be a fairly good fit in terms of architecture. RPython knows the difference between high-level garbage-collected objects and low-level machine-compatible values; GC would be available and most code would be written in a statically-typable dialect of Python 2.7 that tastes like Java or OCaml.

      The OS would be the hard part. RPython admits the same compositional flexibility as standard Python, so it should be possible to hack PyPy into something that can be composed with other RPython codebases. This wouldn't be trivial, though; PyPy in particular is tightly glued to RPython since they are developed together in a single repository, and it wasn't intended for reuse from the RPython side.

      If all of that sounds daunting, and what you would like to do instead is take an existing kernel or shell with C linkage and ELF support, and extend it arbitrarily using Python code, then PyPy can help you in that direction too. Compile a libpypy and embed PyPy against your kernel, and you can then run arbitrary Python code in a fairly nice environment which supports Python-first development. Warning: while the high-level parts of this might be nice, like Python's built-in REPL tools, the low-level parts could be very nasty since this embedding interface is old and rotting, to say nothing of actually getting bare-metal code that doesn't make syscalls.
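
As a rough illustration of the "an interpreter is any function with a while-loop" idea mentioned above, here is a toy stack machine in plain Python, written so that every variable keeps a single type. This is only a sketch: the opcodes are invented, and it has not been run through the RPython translator (which expects Python 2.7 and has its own restrictions).

```python
# Opcodes for a made-up stack machine (illustrative only).
PUSH, ADD, MUL, HALT = 0, 1, 2, 3

def run(program):
    """Execute a list of ints and return the value left on top of the stack."""
    stack = []  # always a list of ints
    pc = 0      # program counter, always an int
    while True:
        op = program[pc]
        if op == PUSH:
            # The operand is stored in the slot right after the opcode.
            stack.append(program[pc + 1])
            pc += 2
        elif op == ADD:
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == MUL:
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
            pc += 1
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError("unknown opcode")

# (2 + 3) * 4
result = run([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])
print(result)
```

An OS main loop that dispatches on events or syscall numbers has the same shape, which is presumably why the fit is described as fairly good.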

  • Warning: talking out of my butt a bit so take with a grain of salt.

    I wonder if you could look at MicroPython. You could implement a Unix-like world on top of MicroPython, then use MicroPython as the layer where a normal OS would be.

    It would be miserable and likely impossible to be fully Unix-compliant, but it could be a fun thing to play with. I would be amazed if it could ever somehow run native Unix binaries.
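
One small piece of such a "Unix-like world on top of Python" could be a cooperative task scheduler built from generators. The sketch below is plain Python tested only in thought against CPython semantics, not on MicroPython; the `task` and `run_round_robin` names are invented for this example, and real processes would need far more (memory isolation, preemption, I/O).

```python
from collections import deque

def task(name, steps, log):
    """A toy 'process' that does `steps` units of work, yielding after each."""
    for i in range(steps):
        log.append((name, i))
        yield  # hand control back to the scheduler

def run_round_robin(tasks):
    """Run generator-based tasks in turn until all of them have finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # let the task run until its next yield
        except StopIteration:
            continue               # task finished; do not requeue it
        ready.append(current)      # still alive; go to the back of the queue

log = []
run_round_robin([task("a", 2, log), task("b", 3, log)])
print(log)
```

Because the tasks give up control voluntarily, this is cooperative rather than preemptive multitasking, which is one of the places where a real kernel needs hardware timer interrupts that Python alone cannot provide.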

    To program an operating system, you need deep knowledge of how the internals work. The language you use needs low-level access. And I don't think a garbage-collected language is a good fit for an operating system either. An interpreted language like Python, in particular, requires an interpreter to run under. And Python is not the fastest language either, which is fatal for low-level OS functionality.

    What do you expect from turning C code into Python code? Python does not have low level access, it requires C (even Rust requires some C functionality). I don't think it is even possible to write an OS purely in Python.

  • I'm not going to tell you you shouldn't do that, I think everybody else has done enough telling others what to do. I'll try to focus more on what you'd need to accomplish and why what you're asking hasn't been done.

    Building an OS involves a lot of complex work using very low level calls. The easiest way to think about it, IMO, is that whatever language you use needs to be able to communicate directly with the hardware without any abstraction between the code and the hardware after it's compiled.

    Basic Python, out of the box, requires multiple levels of abstraction to run.

    (I'm simplifying here) You write code which is run through an interpreter. The interpreter is a compiled application that translates Python into code the operating system can understand. Then the operating system translates that to calls the hardware can understand.

    In that process, the Python code is compiled to bytecode, which the interpreter (itself running as machine code) then executes. The Python virtual machine handles memory management for you. It also handles some processing concepts for you.

    You'd need to start by finding (or inventing) a solution that compiles Python to assembly without the need of an interpreter or OS in between you and the hardware. It's worth noting here that Python itself isn't even fully written in Python and is instead written largely in C because Python isn't a compiled language. You'd then need to extend Python with the ability to completely manage memory and processor threads without the VM. You'd need to do that because that's really the main purpose of an operating system.

    Something we learn in programming is choosing the right tool for the job. Python isn't a great option for this type of project because the requirements just to get to where you can start are so high that it's not really considered worthwhile. Is it possible? Yes, in theory. But without the Python interpreter and VM, you'd have to ask if you're really developing Python or something else that just uses Python's syntax.
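
The bytecode layer described in this comment can be seen directly with the standard `dis` module. This is a small sketch assuming CPython; the exact instruction names differ between Python versions, so no specific listing is shown.

```python
import dis

def add(a, b):
    return a + b

# co_code holds the raw bytecode that the interpreter loop executes.
raw = add.__code__.co_code

# dis decodes those bytes into named instructions for inspection.
names = [ins.opname for ins in dis.get_instructions(add)]
print(len(raw), names)
```

Those bytes are not instructions any CPU understands; something (normally the C-based interpreter) has to execute them, which is the gap an all-Python OS would have to close.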
