[MUSIC] And we now move to the next topic that,

once again, seemingly has nothing to do with

the sequence comparison. We are now on a sightseeing tour in

Manhattan. And our goal is to walk from the

intersection shown in blue to the intersection shown in red.

And to visit as many attractions as possible. The only restriction is that we can either

move south or east. What should be our strategy to visit the maximum number of attractions on this

trip? We can model Manhattan as a grid, and show

edges in this grid where there are attractions as edges of rate one, and then

our goal is to find a path in this graph that has maximum number of

attractions. Sometimes computer scientists call this a longest path in this graph from the

blue node to the red node. We can actually try to solve this problem

in an arbitrary grid with this arbitrary number of

attractions, like this grid shown here. There are many different ways to travel in

this grid. For example, we can travel like this, and in this case, we’ll

visit 18 attractions. The length of the longest path is 18. Is it an optimal path?

No, because we can travel differently, and in this case, visit 20 attractions.

What should be our strategy, for the optimum exploration of Manhattan.

We need to solve the Manhattan tourist problem. The input is a weighted rectangular grid,

and the output is a longest path from the source

vertex (node), the vertex shown by blue, to the sink

vertex (node) that is shown by red in this grid. So what should we do? The simplest thing that comes to mind is

to explore a greedy strategy. For example, being in the very beginning,

we can either move east or move south. If you move east, we will visit three

attractions immediately. If we move south, we will visit only one. It makes sense to move east.

Let’s do it. Then, afterwards, once again, we can move

either east or south. We make choice based on the more, maximum

number of attractions. And we continue like this, and finally, we arrive to the source, visiting 23

attractions. And you have already guessed that this is a very simple

strategy. But not an optimal one. We need now to find the optimal strategy. Another thing we need to keep in mind is that Manhattan is not a perfect

rectangular grid. Broadway cuts across. And we can model this grid as an arbitrary

graph where edges can go from whatever vertex (node)

to whatever node. And how do we travel in this grid? And to travel in this grid, we need to

solve (find) the longest path in a directed graph problem,

where the input is an edge-weighted directed graph with the source

and sink nodes, and the output is simply a longest path from source to sink

in this graph. Now you may be surprised by now why I’m talking about alignment game and

Manhattan. Do you see a connection between the longest path problem and the

alignment game? And it may be not obvious that there is a connection but let’s try to figure out

what is it. For every column of the alignment

matrix, let’s code it with an arrow, as I showed in the slide. And then, after we design this arrow, let’s see how

this arrow would translate into the new grid that I have constructed

and presented in this slide. The first arrow is diagonal. Let’s move diagonally in our grid. The next arrow is also diagonal, let’s

move diagonally again. The next arrow is horizontal. We continue further horizontally. Diagonal, diagonal, vertical, diagonal,

vertical, diagonal, and actually using alignment matrix, we

were able to travel in the Manhattan grid. Right? Now, let’s ask a reverse question. If we have a path in Manhattan grid, would we be able to construct the

alignment matrix? Let’s try. This is an arbitrary path in the alignment grid.

Let’s combine all the arrows in one place. And as soon as we’ve done it, of course we can construct the alignment matrix as I

showed here. Therefore, alignments are nothing but

passes in the grid. And therefore, to play the alignment game, to construct longest common subsequences,

the only thing we need to do is to travel

optimal ways in the graph. And the often question for sequence

comparison problem in biology amounts to build an

appropriate Manhattan. We’ll do it a lot in this lecture. In the case of the longest common subsequence,

what would be this Manhattan? In this case, diagonal red edges

correspond to matching symbol and have a score of one. If these corresponding symbols for these

diagonal edges match to each other.

And highest scoring alignment in this case is simply the longest path in a

properly built Manhattan. And to learn how we travel in whatever

Manhattan, we need once again to talk about the

problem that seemingly has nothing to do with biology.

And it is the change problem.