Exam 2 -- Optimal Expedition Pairs

The problem is available in brute at

Optimal Expedition Pairs.

'''
The problem is a slightly extended version of a 1D segment problem:
 
Given a list of 1D segments with integer end points,
find all pairs of segments which satisfy all three following conditions:
 
 1. The sum of length of segments in the pair is maximum possible
 2. The start of the second segment in the pair is at least 7 units (=days)
    after the end of the first segment in the pair
 3. The gap between the first and the second segments in the pair
    does not contain any start or end of another segment.
 
Note that in the exam problems, the segments are called expeditions.
That, of course, does not change the structure of the problem,
so let us stay with segments for a moment.
 
Let us suppose that the description of a segment is a list
with three elements
   [<segment_start> <segment_end> <text_description>]
The items <segment_start> and  <segment_end> are numerical,
the item  <segment_description>
is just some string storing additional human-friendly information.
We keep it there only because the exam problems specifies some
well defined output and <segment_description> is exactly what we output in the end.
 
At this moment we leave aside all ideas related to the dates in the year
which are in the exam problem, we just suppose that
<segment_start> and  <segment_end> are simple integers
and <segment_description> is just some string completely
unrelated to all intermediate computations.
 
The solution problem might be solved by applying two nested loops
to investigate all possible pairs of segments and to store
separately copies of those pairs which sum is the longest.
'''
 
def findOptiPairs( listOfSegments ):
    listOfOptimalPairs = []
    lengthOfOptimalPair = 0
    for segment1 in listOfSegments:
        for segment2 in listOfSegments:
            if pair (segment1, segment2) satisfies conditions 2. and 3.:
                 pairLength = lengthOf(segment1) + lengthOf(segment2)
                 if lengthOfOptimalPair > pairLenght: continue
                 if lengthOfOptimalPair < pairLenght:
                     lengthOfOptimalPair = pairLenght
                     listOfOptimalPairs = []
                 listOfOptimalPairs.append([segment1_description, segment2_description])  )
 
    # sort listOfOptimalPairs if necessary
    # and print it
 
'''
 
Now, to upgrade this solution scheme to a working solution of the
exam problem, a few additional procedures have to be considered.
Fortunately, as we will see, each of those procedures can be coded
in quite transparent and short manner.
Here is the list:
A: Extract the list of segments form the input list
B. Implement a function which decides if condition 2 and 3 is satisfied
C. Sort the list of optimal pairs.
 
 
Each of these additional procedures  A, B, C, can and
should be independent on each other and also on findOptiPairs function,
so we may feel free to implement them in nearly any way we can imagine.
The only entity shared among A. B. C. and findOptiPairs is the
data representation of the segments (so called expeditions!) in the format
described above -- [<segment_start> <segment_end> <text_description>].
 
A.
To simplify things a bit, represent each day in a year by a single integer
denoting the order number of the day in the entire year.
For that aim, utilize the array
predDays = [0, 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]
which registers, for each month, the total number of days in the year
before the 1st of that month. Then, given  the month m and day d,
the order number of the day is
   day + predDays[mon]
which is done in function
   numOfDay(dateStr):
where dateStr is a string containing a date in format DD:MM.
The function extractDatesFromLine( line )
extracts all substring in the format DD:MM form a given text line
converts them into a list of dates and chooses the minimum
in that list as a start date of the segment (expedition)
and chooses the maximum  in that list as an end date of the segment (expedition)
specified by the given line of text.
Before the conversion into number of a day, the date has to be checked for validity
by function isDate. This function ensures that there are exactly two dots in the
string and that the substrings before the first dot and between the dots are
integer numbers in corresponding to month and number of days in the month.
 
B.
Satisfying condition 2. and 3.
Condition 2 is straightforward to check, it is just a difference between two numbers
  if segment2_start - segment1_end <= 7:    return False   #create the  the gap is too small
 
Condition 3
We suggest a fast and easy to comprehend way of testing.
Let us assign to each day in the year, a value True/False
depending on whether that particular day is the beginning or the end
of some segment.
We can store this information  in an array of True/False values
where the index corresponds to the number of the day in the month.
The name of the array may be e.g. forbiddenDays.
Then, for any pair of segments, we just check (using array slicing)
     if True in forbiddenDays[segment1_end+1: segment2_start]: return False   # the gap is not accepted
(How to create array forbiddenDays - se the final code.
 
C.
Sorting the output list of pairs of segment.
When we think about it for a while, it appears that the list is already nearly sorted,
due to the order in which we scanned (in nested loops) the  list of segments.
We just have to sort the list by one additional criterion and the is the order
of the segment starts (= start days of the first and then the second expedition).
We append on additional value at the beginning of each pair of expedition.
This additional value is artificial, it just expresses the correct order.
The value is <segment1_start> * 365 +  <segment2_start>.
It ensures that the pairs with low <segment1_start> come first,
no matter what their value of  <segment2_start> is.
Also, each unique pair <segment1_start>,  <segment1_start>
is represented by a unique value, therefore at the sort
distinguishes all possible different pairs   <segment1_start>,  <segment1_start>.
After the sort is done, we neglect this additional value and 
only print the list.
'''
 
# ----------------------------------------------------
#    T H E   F I N A L    C O D E
# ----------------------------------------------------
 
mDays = [0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
 
predDays = [sum(mDays[0:j])  for j in range( len(mDays))  ]
# predDays = [0, 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]
# predDays the register, for each month, the total number of
# days in the year before the 1st of that month.
# It simplities calculation of the number of a day in a year
# day number = day_in_month + predDays[month]
 
gapsize = 7
 
def isDate( dstr ):
    pieces = dstr.split('.')
    if len(pieces) != 3:            return False
    if len(pieces[2]) > 0:          return False
    if not pieces[0].isnumeric():   return False
    if not pieces[1].isnumeric():   return False
    day, mon  = int(pieces[0]), int(pieces[1])
    if not 1 <= mon <= 12:          return False
    if not 1 <= day <= mDays[mon]:  return False
    return True
 
def numOfDay(dateStr):
    pieces = dateStr.split('.')
    day, mon  = int(pieces[0]), int(pieces[1])
    return  day + predDays[mon]
 
def extractDatesFromLine( line ):
    dates = [ [numOfDay(word), word]  for word in line.split() if isDate(word) ]
    if len(dates) == 0:  return  -1, -1,  ""
    dates.sort()
    # min, max and their str representation
    return dates[0][0], dates[-1][0], dates[0][1] + " " + dates[-1][1]
 
 
def isGoodPair(exped1, exped2, forbidDays):  # all conditions except optimality
    if exped1[0] == exped2[0]:                     return False
    if exped1[2] + gapsize >= exped2[1]:           return False
    if True in forbidDays[exped1[2]+1: exped2[1]]: return False
    return True
 
 
def findOptiPairs(expeds):
    forbidDays = [False] * 365
    optiExpeds = []
    optiExpedLen = 0
    for e in expeds:
        forbidDays[e[1]], forbidDays[e[2]] = True, True
    for expe1 in expeds:
        for expe2 in expeds:
            if isGoodPair(expe1, expe2, forbidDays ):
                pairLen = expe1[2]-expe1[1] + expe2[2]-expe2[1] + 2
                if pairLen < optiExpedLen: continue
                if pairLen > optiExpedLen:
                    optiExpedLen = pairLen
                    optiExpeds = []
                optiExpeds.append([expe1[1]*365+expe2[1]]+ expe1 + expe2 )
 
    # output results:
    print( optiExpedLen )
    for optiPair in sorted(optiExpeds):
        print(optiPair[4], optiPair[8] )
 
# ----------------------------------------------------
#    M A I N
# ----------------------------------------------------
# expedition is represented as a list with 4 items:
# [lineNo, startDayNum, endDayNum, expedDescr]
# startDayNum, endDayNum are integers   day numbers in the year
# expedDescr is a string which describes the expedition and is printed in the output.
 
N = int(input())
expeds = []      # expeditions
for lineNo in range(1, N+1):
    line = input().replace(",", "")
    startDayNum, endDayNum, startEndString = extractDatesFromLine( line )
    if startDayNum != -1: # if line contains expedition
        expedDescr = str(lineNo) + " "+ startEndString
        expeds.append([lineNo, startDayNum, endDayNum, expedDescr] )
 
findOptiPairs(expeds)