Best practices for splitting Python into multiple .py files?

Can anyone point me to some really good tutorials that explain the logic behind when and how to split up a large Python program into separate files? Any best practices on where to do the splits? What are the pitfalls? What will I need to remember about how to use those files later in the main program that beginners always forget?

LambTracker Desktop is getting unwieldy and I need to make it more compartmentalized, but I’m not clear on how best to organize the code.

In Android Java every class would be a separate file but not sure if that makes sense in the Python world.

1 Like

I don’t have a reference but I have a practice in my coding.

In one case, I might split by function. Three example categories are UI controllers, general purpose utility routines, and core processing routines. For example, a function that sorts an input array is general purpose, a function that presents the dialog defining how to sort the array is UI, and a function that processes a sorted array for a specific output is core.
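A minimal sketch of that three-way split, shown as one listing with comments marking where each hypothetical file (`utils.py`, `core.py`, `ui.py`) would begin. The file and function names are illustrative assumptions only, and the UI part is reduced to a plain function so the sketch runs without a GUI toolkit.

```python
# --- utils.py: general-purpose helpers that know nothing about the app ---
def sort_records(records, key, descending=False):
    """Return a new list of dicts sorted by the given key."""
    return sorted(records, key=lambda r: r[key], reverse=descending)


# --- core.py: app-specific processing of already-sorted data ---
# In a real split this file would start with: from utils import sort_records
def oldest_animals(records, count):
    """Return the names of the `count` oldest animals."""
    by_age = sort_records(records, key="age", descending=True)
    return [r["name"] for r in by_age[:count]]


# --- ui.py: gathers the user's choices and calls into core ---
# In a real split this file would start with: from core import oldest_animals
def show_oldest(records, count_text):
    """Turn raw user input (text from a dialog) into a report string."""
    count = int(count_text)
    return ", ".join(oldest_animals(records, count))


if __name__ == "__main__":
    flock = [
        {"name": "Daisy", "age": 4},
        {"name": "Clover", "age": 7},
        {"name": "Bramble", "age": 2},
    ]
    print(show_oldest(flock, "2"))  # prints: Clover, Daisy
```

The dependencies point one way only: the UI imports core, and core imports utils. Keeping them that way is one way to avoid circular imports once the sections really are separate files.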

On a multi-functional development, I might also split by sub-function. For example, in an image processing tool that I am creating, I have a function base to load data files, a function base to calibrate the image scale, and a function base to remove the background. Each function base is in a separate file and gets included at compile time to the main routine.

Finally, one idea is to split routines in a way that you only touch the files with code that you need to update systematically while preserving (not touching) the files with code that is “static” or “stable”. Here again, you may have a routine to sort an array that you should never need to touch again compared to the routines that should process the results from the sorted array.

—
JJW

3 Likes

I don’t recall seeing anything in the many books I’ve browsed/read on Python, but perhaps there is something somewhere.

My experience is that dealing with regular Python source files is easiest with a good editor that has Python-specific tools. I use Sublime Text most of the time, supplemented by BBEdit. Sublime has some great ways of “folding” the file at selectable levels of hierarchy, so when you “fold all”, all you see is the first line of each “def” (or class). As I don’t proactively split up Python source code files, one of the files in my most recent ‘project’ has grown to ~12k lines, and I didn’t notice until I just looked! The whole “system” I have is ~90k lines.

You could put each function/class in its own file, but that feels too complicated and unnecessary to me (and I would guess it adds complexity when importing things).

That being said, if you are using Python with a particular framework, e.g. Django (which I like and use, not for web sites but for its terrific way of handling queries to a MySQL database), then the framework usually provides guidance on how to best take advantage of it. The Django documentation provides guidance and I’m sure there are many other web sites and books, but I won’t pursue that here as I don’t know if you are using Django.

All that being said, if you must break it up, I’d say do it by “function”. And I would probably move stuff that is “done” and tested into so-called “library.py” that holds common stuff.
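A rough sketch of the “library.py” idea, including one thing that is easy to forget: code under `if __name__ == "__main__":` runs only when a file is executed directly, not when it is imported. To stay self-contained, the sketch writes a hypothetical `library.py` into a temporary folder. In a real project the file would simply sit next to the main script and a plain `import library` would be enough. The helper `full_tag` is an invented example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical contents of library.py: stable, tested helpers.
LIBRARY_SOURCE = '''
def full_tag(flock_prefix, number):
    """Build a display tag such as 'LT-0042'."""
    return f"{flock_prefix}-{number:04d}"

if __name__ == "__main__":
    # Runs only when library.py is executed directly, not on import.
    print(full_tag("LT", 42))
'''


def load_library(folder):
    """Write library.py into `folder` and import it the way main.py would."""
    Path(folder, "library.py").write_text(LIBRARY_SOURCE)
    sys.path.insert(0, folder)  # a script's own folder is normally on sys.path already
    importlib.invalidate_caches()
    return importlib.import_module("library")


with tempfile.TemporaryDirectory() as folder:
    library = load_library(folder)
    sys.path.remove(folder)  # tidy up the path entry added for the sketch

print(library.full_tag("LT", 7))  # prints: LT-0007
```

Importing the module does not trigger the `print` inside the guard, so the same file can carry a quick self-check and still be safe to import from the main program.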

I know you are looking for more specific guidance, but I’ve never run across it.

2 Likes

The principles in this book will help you. It is not language-specific, but conceptual. It changed my programming life, and has allowed me to write and maintain an open-source application for the past 17 years with very few bugs, eliminating things ‘going south’ when a change is made, minimizing/eliminating ‘chain reaction’ type events where a change here mandates a change there, etc.

https://www.amazon.com/Pragmatic-Programmer-journey-mastery-Anniversary/dp/0135957052


1 Like

I follow the same principles I learned as a C++ developer: every class is a separate source file, and every class should be small and have a well-defined purpose. Don’t be afraid to create a class with just a few methods. Your computer can handle a project with hundreds of source files no problem.

1 Like

My app is designed around a series of screens or windows, each of which deals with a particular subset of functions. For example, right now I have Animal Search, Add/Edit Animal and Animal Reports, all as very separate things on separate Toplevel tkinter windows. Population Analysis will result in several more screens, and so on for other major categories. Within a section like Animal Search, most of the code is only used there, and to navigate to another section you go to a completely new screen.

I’m using PyCharm as the IDE so not an editor per se. I haven’t looked at folding possibilities at all, thanks for mentioning that.

Nope, standard Python 3.7 with tkinter and ttk as the UI toolkit. Trying for minimal excess stuff, although I know I will be adding SciPy and NumPy once I get into the population analysis code.

Thanks for the book suggestion. I just got the sample on Kindle and it looks like a lot of the conceptual stuff I am looking for. As an aside, what’s your open source project? LambTracker is also open source. None of the desktop code is available on my GitLab repository yet, but it should be RSN, I hope.

I appreciate that observation as well. I will be packaging the final LambTracker into a package for Mac, Linux and Windows, probably using PyInstaller, although exactly how isn’t yet decided. The goal is for the inexperienced user to be able to just run the app, with the Python interpreter, modules, libraries, etc. all included. Most of my potential users wouldn’t know how to install a Python interpreter or load something like SciPy at all, so I have to have it appear as a monolithic application to the end user.

If you are planning to use SciPy and NumPy, then be sure to look into and seriously consider Pandas…

“Trying for minimal excess stuff”

… I find that relying on Django (for the database interface) simplifies and (massively) reduces the size of the app that I have to program, while adding a lot of capability. It also reduces risk, as there is less to test/debug.

Thanks for asking!

The usual way of making printed circuit boards is to chemically coat the copper, expose to light through a mask (often created with a laser printer on transparency film, sometimes ironed on, sometimes hand drawn), remove excess coating, then etch with ferric chloride (for instance) to remove excess copper, use another chemical to remove the remaining coating, then drill holes for mounting through-hole components, etc.

I got tired of ruining t-shirts with the chemicals, and making bad boards, so I wrote a plugin for a printed circuit design program called EAGLE that creates g-code (x,y,z coordinates and tool control code) that drives a computer-controlled router to cut out around the tracks of the printed circuit board and drill holes for components. This eliminates the need for toxic and messy chemicals, and automates the process of drilling holes, which people usually do by hand. My program is in use by, I estimate, a few thousand people around the world.

Typical router, though this varies widely to include industrial CNC milling machines

Example PCB

Users Group

GitHub

2 Likes

I’m ignorant of what Pandas actually does. I’m going to be calculating coefficients of inbreeding, kinship/relationship, genetic distance, generation interval, breeding population etc.

Isn’t Django only for building web apps?

LambTracker is standalone. The plan is to have it run as an ordinary app on Mac, Linux and Windows. My database is SQLite. For me the database part is easy; it’s the darned GUI that is a PITA to develop! First off, deletes never happen because of requirements for maintaining a history. I add records, and I occasionally update records to fill previously empty fields, but records do not get deleted.
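One way such an add-and-fill-only rule could be kept in a single small database module is sketched below with the standard `sqlite3` module. The table, column and function names are hypothetical and are not LambTracker’s actual schema. The trigger that blocks deletes is an optional extra assumed for the sketch, not something described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for the sketch only
conn.execute(
    """CREATE TABLE animal (
           id INTEGER PRIMARY KEY,
           name TEXT NOT NULL,
           birth_date TEXT,
           death_date TEXT  -- stays NULL until known
       )"""
)

# Optional guard (an assumption of this sketch): reject any DELETE on the table.
conn.execute(
    """CREATE TRIGGER animal_no_delete BEFORE DELETE ON animal
       BEGIN
           SELECT RAISE(ABORT, 'animal records are never deleted');
       END"""
)


def add_animal(conn, name, birth_date=None):
    """Insert a new record and return its id."""
    cur = conn.execute(
        "INSERT INTO animal (name, birth_date) VALUES (?, ?)", (name, birth_date)
    )
    return cur.lastrowid


def fill_death_date(conn, animal_id, death_date):
    """Set death_date only if it is still empty; return True if a row changed."""
    cur = conn.execute(
        "UPDATE animal SET death_date = ? WHERE id = ? AND death_date IS NULL",
        (death_date, animal_id),
    )
    return cur.rowcount == 1
```

With every write going through functions like these, the GUI files never need to contain SQL, and the history rule lives in one place.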

Pandas is brilliant. It’s like a merged sql, numpy, and ORM. Makes traversing and manipulating large datasets require very little code and mostly performant.

For code organization, I haven’t been in charge of a codebase large enough where organization really mattered. But at work, the Python part of our codebase is separated by feature (in terms of files and directories).

As with anything else regarding organization, I think the most important thing is just to be consistent.

1 Like

As @dustinknopoff says, Pandas is brilliant. Quite prominent, and uses NumPy and SciPy. https://pandas.pydata.org.

Django is used for web apps, but that’s not all it’s good for. I express my database model in Django (which can use all the leading databases, including SQLite), and Django then exposes it as Python-friendly “objects”. It comes into its own for relational databases with numerous relationships (which most data models have).

Neither may be for you, but if I were doing what I see you are doing, I’d do it on top of Pandas and Django.

Enjoy.

Neat!

1 Like

Just wanted to say that I got the book and am really learning a lot from it. Thank you for the recommendation!

2 Likes