First a comment on open source; I know that programmers writing
commercial scheduling programs frequent this list, I’ve talked to one. I believe
that this should be an open source project. It should be developed as a stand
alone engine that can be utilized by developers of both open source
administrative systems and commercial systems. I am willing to work with anyone
who shares this vision. I would not approve (although there is nothing I can do
about it) of anyone taking this start and adding good ideas in secret to it. If
you have a good idea please share it with the rest of us.
In more than 20 years of programming I have found one truth, everything
that fundamentally shapes programming is stolen. Most of it is stolen from
Nature. So when I look for a solution to the problem of building and scheduling
a master schedule I look to nature. Evolution is nature's problem solver, but
how does it work? What are the hallmarks of its work?
Evolution starts with DNA and makes random changes in the DNA. The
resulting organism needs only to pass a simple test to be successful.
·
It must be able to
procreate, which of course implies that it is alive.
Simply being alive is not enough (at least for long). This could be
described as make a change and test strategy.
What are the hallmarks of evolution?
·
It operates over a long
period of time
·
It operates on a
massively parallel scale. There is not much communication going on. Very few of
the objects (creatures) it creates interact with each other. For all we know it
could make the same mistakes and cover the same ground millions of
times.
·
It produces unexpected
results. I remember an evolutionary program written at Stanford many years ago.
Its goal was to produce an organism capable of moving. The different computer
creatures it came up with where totally unpredicted.
·
It has a simple binary
rule structure (live or die). Living could also be akin to learning, since it
starts the next iteration with a new smarter base.
·
Changes in the
environment some how drive it. Most of the time we are not evolving, evolution
is not doing much useful. But change the environment and evolution kicks in to
gear, making changes, filling niches.
·
The vast majority of
its changes are negative (to put it politely) and fail.
·
It builds on success;
its successes are the templates for the next generation.
·
?
Which of the above can we use?
- “Long period of time”, we may only have a month
or two for running. Good thing computers are fast.
- “Massively parallel scale”, this we need. The
more CPU cycles the better. Of course our massively parallel might involve a
couple of hundred entities (computers) vs natures billions. Since there will
be communication that prevents the computers from working on the same part of
the problem we will eliminate the making of the same mistake or going over the
same ground over and over and thereby cut down the time period
required.
- “Unexpected results”. This largely a factor of
the randomness and scale of evolution and it greatly increases the amount of
time required. For at least the first generation, I believe we will have to
eliminate the random portion and therefore the unexpected results.
- “Changes in the Environment”, I think we will
see the same here. The master schedule the second year will look remarkable
like the first year and it should. Change for no reason just upsets people.
But if there is a major change in the environment (new school, layoffs, change
to block scheduling etc) this should really kick into its own.
- “Builds on success” we are going to create some
life first (master schedules that work to some degree) and build on
that.
What are the hallmarks of a good master schedule?
- Courses should be taught in the least number of
rooms, by the least number of teachers and require the least number of
sections.
- Sections should be evenly balanced; each section
should have about the same number of students as other sections of the course.
Each section should have a consistent ratio of males to females.
- Teachers should work a consistent number of
periods per day in line with the contract.
- Each meeting time of a course should be on a
different day unless that are more meeting times than the master schedule
rotation (how many school days pass before the schedule repeats, typical 5 or
10). In such cases the number of times a course meets on the same day should
be minimized.
- Courses should be taught in rooms designed for
the course.
- Courses should be taught by qualified
(certified) teachers in the subject.
- Class sizes should be at or below the constraint
size.
- Teachers should teach all their class in a
single room.
- ?
At the end of a scheduling process there is always what I call a “playing
god” phase. Here someone (normal the principal) makes the hard choices to get
the school scheduled. In some schools it might involve 1 or 2 students in larger
schools it might be 5%. What kind of decisions does he/she make? These will be
the basis for our schedule mutations. By having these directed mutations vs.
natures random ones, we will cut down the CPU cycles necessary.
- Exceed the seat count for a section.
- Exceed the seat count for a room.
- Make teachers teach in multiple rooms. Do this
by seniority?
- Teach a class in the wrong type of room. English
in a chem. Lab?
- Have a teacher teach a course they don’t
normally teach or worst case is not certified to teach.
- Hire a new teacher.
- Substitute a course a student has not requested
for one that can’t be scheduled. First try an alternative or closely related
course (music for band) or one that fills the same graduation
requirement.
- Have a class meet more than once a day even if
the rotation does not require it.
- ?
Here then is my general plan:
- Generate every non-constraint violating master
schedule. The algorithm must allow them to be generated by number so I can
tell computer #1 to generate and check the 1st 1000 scheduled and
computer #2 the next etc.
- The program must operate as a screen saver so I
can run it on all computers and get every extra CPU cycle possible without
bothering other users. I figure an average computer in a school is only
operated 6-7 hours per day, giving me 18 hours of CPU time.
- Schedule each of these master schedules against
the course requests and report the status (master schedule # and scheduling
percentage) to the server. The server will insure that all schedules are
checked.
- Take the top 100-200-300 (what is a good live or
die metric beside percent scheduled?) or what ever schedules and start
mutating them by doing what the principal would do as describe above. Before
this whole thing the school would rate how onerous each of the possible
mutations was.
- Print a summary of the best results with a list
of mutations and let principal pick the best.