Python Module Shenanigans
This is a note to myself, because I keep running into the same problem when working with Python. Don’t you just hate it when Python goes
ImportError: attempted relative import with no known parent package
PYTHONPATH
Python searches for modules in sys.path, a list of directories that it tries in order. The PYTHONPATH environment variable adds directories to that list.
To print sys.path, put this at the very top of the first invoked Python file:
import sys
print(sys.path)
Usually sys.path contains something like this:
[
    'containing folder of the entrypoint script',
    '.../lib/python312.zip',
    '.../lib/python3.12',
    '.../lib/python3.12/lib-dynload',
    '.../lib/python3.12/site-packages'
]
This is the reason why this usually doesn’t work:
$ tree
app/
    module1.py
    module2.py
entrypoint/
    entrypoint1.py
    entrypoint2.py
$ cat entrypoint/entrypoint1.py
import app.module1
import app.module2
...
$ python entrypoint/entrypoint1.py
ModuleNotFoundError: No module named 'app'
What would make it work:
$ PYTHONPATH="$(pwd)" python entrypoint/entrypoint1.py
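To make the difference concrete, here is a minimal sketch that rebuilds a trimmed-down version of the layout above in a temporary directory and runs the entrypoint twice, once without and once with PYTHONPATH pointing at the project root. The helper names (build_layout, run_entrypoint, demo) and the NAME constant are made up for illustration; only the folder and file names come from the example above.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Optional


def build_layout(root: Path) -> None:
    """Create app/module1.py and entrypoint/entrypoint1.py under `root`."""
    (root / "app").mkdir()
    # NAME is a hypothetical constant, only here so the import has something to print.
    (root / "app" / "module1.py").write_text("NAME = 'module1'\n")
    (root / "entrypoint").mkdir()
    (root / "entrypoint" / "entrypoint1.py").write_text(
        "import app.module1\nprint(app.module1.NAME)\n"
    )


def run_entrypoint(root: Path, pythonpath: Optional[str] = None) -> subprocess.CompletedProcess:
    """Run the entrypoint from `root`, optionally with PYTHONPATH set."""
    env = dict(os.environ)
    env.pop("PYTHONPATH", None)  # start from a clean environment
    if pythonpath is not None:
        env["PYTHONPATH"] = pythonpath
    return subprocess.run(
        [sys.executable, "entrypoint/entrypoint1.py"],
        cwd=root,
        env=env,
        capture_output=True,
        text=True,
    )


def demo() -> tuple:
    """Return (exit code without PYTHONPATH, exit code with it, stdout with it)."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_layout(root)
        without = run_entrypoint(root)
        with_path = run_entrypoint(root, pythonpath=str(root))
        return without.returncode, with_path.returncode, with_path.stdout.strip()
```

The first run fails because only entrypoint/ ends up at the front of sys.path; the second succeeds because the project root, which contains app/, has been added.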
Relative imports
You can prefix the module path with a dot, as in from .x import y, to signify a relative import:
app/
    x/
        y.py
    a.py # from .x import y
main.py # import app.a
You can use multiple leading dots to navigate upward in the package tree: one dot is the current package, two dots its parent, and so on. Relative imports only work with the from ... import ... form (a plain import .x.y is a syntax error).
app/
    u/
        v.py
    x/
        y.py # from ..u.v import v_const ; this would be equivalent to `from app.u.v import ...`
    a.py # from .x.y import y_const ; this would be equivalent to `from app.x.y import ...`
main.py # import app.a
You cannot navigate up to/above the level of the root folder (containing the entrypoint):
u/
    v.py
x/
    y.py # from ..u.v import v_const
main.py # import x.y
This raises the following error:
$ python main.py
Traceback (most recent call last):
  File ".../main.py", line 1, in <module>
    import x.y
  File ".../x/y.py", line 1, in <module>
    from ..u.v import v_const
ImportError: attempted relative import beyond top-level package
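The dot-counting rules from the layouts above can be checked without creating any files. This is a small sketch using the standard library's importlib.util.resolve_name; the resolve wrapper is a made-up helper, and the second argument stands for the package of the module doing the import ("app" for app/a.py, "app.x" for app/x/y.py, "x" for x/y.py).

```python
from importlib.util import resolve_name
from typing import Optional


def resolve(relative_name: str, package: str) -> Optional[str]:
    """Return the absolute module name, or None if the dots climb too far."""
    try:
        return resolve_name(relative_name, package)
    except (ImportError, ValueError):  # ValueError on Python versions before 3.9
        return None


# app/a.py lives in package "app": one dot stays inside "app".
print(resolve(".x.y", "app"))
# app/x/y.py lives in package "app.x": two dots climb to "app".
print(resolve("..u.v", "app.x"))
# x/y.py lives in top-level package "x": two dots would climb past it.
print(resolve("..u.v", "x"))
```

The last call cannot be resolved because "x" has no parent package to climb into, which is the same situation as the traceback above.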
You cannot even use a relative import at the root folder level, because a script that is run directly is not part of any package:
x/
    y.py
main.py # from .x.y import y_const
Traceback (most recent call last):
  File ".../main.py", line 1, in <module>
    from .x.y import y_const
ImportError: attempted relative import with no known parent package
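One way to see where "no known parent package" comes from is to look at the __package__ variable that Python sets on every module. The sketch below is only an illustration: pkg/mod.py is a hypothetical module that does nothing but print its own __package__, and it is run once as a plain script and once with python -m for contrast.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def package_seen_by(args: list, cwd: Path) -> str:
    """Run Python with `args` inside `cwd` and return what the module printed."""
    result = subprocess.run(
        [sys.executable, *args], cwd=cwd, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def demo() -> tuple:
    """Return the __package__ value seen as a script and as `-m pkg.mod`."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "pkg").mkdir()
        (root / "pkg" / "__init__.py").write_text("")
        # Hypothetical module that only reports its own __package__.
        (root / "pkg" / "mod.py").write_text("print(repr(__package__))\n")
        as_script = package_seen_by(["pkg/mod.py"], root)
        as_module = package_seen_by(["-m", "pkg.mod"], root)
        return as_script, as_module
```

When the file is run as a script, it has no package name to resolve leading dots against, so any relative import in it fails; when the same file is loaded as pkg.mod, it knows it belongs to pkg.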
How to deal with it
If you come from a Node.js background, you may be accustomed to a folder structure like this:
package.json
.gitignore
...
src (or app)/
    main.js # const helloRoute = require("./routes/hello") or import helloRoute from "./routes/hello"
    routes/
        hello.js
        goodbye.js
    middlewares/
        auth.js
        logging.js
Then in your root folder, you would just run node src/main.js.
Python, however, follows a module/package mindset. Your backend source code (routes, middleware, models, etc.) is just another package, no different from fastapi or numpy, that your main entrypoint script imports and executes. This means the entrypoint needs to be kept completely separate from the rest:
pyproject.toml
.gitignore
...
app/
    routes/
        hello.py
        goodbye.py
    middlewares/
        auth.py
        logging.py
main.py
So what about projects that have multiple entrypoints? Maybe you have a main script that starts the HTTP server, but also another script that starts an ETL pipeline that processes incoming data and stores it in the database. One way to go is to put every entrypoint at the root level:
app/
    models/
        user.py
        order.py
    routes/
        auth.py
        logging.py
    ...
main.py # start the HTTP server, use models, routes
etl.py # also imports models
But after a few entrypoints your codebase might be quite cluttered at the root level.
User @matejcik on Stack Overflow gives us the following solutions:
- Turn the app/ folder into an installable package. At this point, import app is no different in nature from import json, import numpy, etc.
- Restructure the codebase so that there is only one entrypoint. You would use argparse or something similar to direct the entrypoint to different functionalities:
$ python main.py http # starts the http server
$ python main.py etl # starts the ETL listener
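A single main.py that dispatches on those two commands could look roughly like the sketch below. It uses argparse subcommands; start_http_server and start_etl_listener are placeholder functions invented for this example, standing in for whatever the real app package would expose.

```python
import argparse
from typing import Optional, Sequence


def start_http_server() -> str:
    # Placeholder: a real project would start the HTTP server here.
    return "http server started"


def start_etl_listener() -> str:
    # Placeholder: a real project would start the ETL listener here.
    return "etl listener started"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per piece of functionality."""
    parser = argparse.ArgumentParser(prog="main.py")
    subparsers = parser.add_subparsers(dest="command", required=True)
    http = subparsers.add_parser("http", help="start the HTTP server")
    http.set_defaults(handler=start_http_server)
    etl = subparsers.add_parser("etl", help="start the ETL listener")
    etl.set_defaults(handler=start_etl_listener)
    return parser


def main(argv: Optional[Sequence[str]] = None) -> str:
    """Parse the command line (sys.argv by default) and run the chosen handler."""
    args = build_parser().parse_args(argv)
    return args.handler()
```

In an actual main.py you would call main() under the usual if __name__ == "__main__": guard, so that python main.py http and python main.py etl reach the right handler.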