Computer Chess Club Archives

Search

Terms
Messages

Subject: Re: Storing the PV in a search routine

Author: Brian McKinley
Date: 19:30:50 06/09/98

On June 09, 1998 at 15:09:16, Bruce Moreland wrote:

>
>On June 09, 1998 at 14:44:18, Scott Gasch wrote:
>
>>
>>Hi.  I'm a newbie at computer chess and am having some trouble storing
>>the PV with my search routine.  I am using an alpha beta search with
>>a depth identifier to moderate search depth.  Before I return from any
>>invokation of search I store the best move it found in a move array at
>>pv[depth].  I thought this would be the correct way to operate since
>>the search is depth first.  However, when I look at the move array later
>>it often makes no sense; for instance, the same piece is moved twice on
>>successive turns or pieces get taken where they are not located in the
>>deeper levels of the PV.  The first move in the array is always legal
>>and is (usually) a sound move.  This leads me to believe the search
>>routine is working but I am just missing some concept about how to store
>>the PV.
>
>I will invent some terminology first.
>
>I'll call the root of the search "level 0".  When this instance of
>search recurses, the recursed call is "level 1", when that recurses, you
>get to "level 2", etc.  The deepest level you happen to get to in the
>course of a line I'll call "level N", although "N" can change throughout
>the tree.
>
>Each possible level is going to need an array of moves, along with a
>count of how many elements are currently in the array.
>
>When you do a level N node, and the score comes back between alpha and
>beta, this is a tip PV node.  When you encounter one of these, you set
>your move element counter for this level of the search to zero, because
>there wasn't actually a move made here, you just evaluated a position
>and thought it was fine.
>
>Anywhere else in the search, if you get a score back that is between
>alpha and beta, you have to do something more complicated.  You take the
>move that you just made, the one that you recursed on, and you stick it
>at the head of this array.  You then take the array for the next deeper
>level of the search, and you append it to this array.  The number of
>elements in the array at this level is now the number of elements in the
>deeper level's array, plus one.
>
>When your search finishes, the PV is in the level zero array.
>
>The memcpy doesn't kill you, because you'll only do it a few hundred
>times per search, maybe.
>
>This sounds complicated but it is maybe five lines of code.
>
>bruce

Here is another approach:

With this implementation you only do a copy when a leaf node (level N
node in Bruce's terms) returns a score between alpha and  beta, and then
only after you have searched all other relevant descendents at that ply.
 At any other node you simply swap pointers if you fail high.

The easiest way to explain is by example.  Like with Bruce's
implementation you will need an array of moves for each ply.  For this
example we will encapsulate the move array in CMoveList to make the code
more readable.

class CMoveList
{
	CMove m_moves[MAXPLY];
	int m_size;
	int Size() { return m_size; };
	void Size(int size); {m_size = size;}
	CMove & operator [] (int i) {return m_moves[i];};
	CMoveList & operator = (CMoveList & movelist );
};

CMoveList * currentpath;
CMoveList * pv[MAXPLY];

currentpath = new CMoveList();

for (int i = 0; i < MAXPLY; i++)
	pv[i] = new CMoveList();

The important thing to remember is that a principal variation is
directly linked to an evaluation.  As we back up evaluation values, we
need to back up the corresponding principal variation.  To keep the
value and the principal variation together, we can create a structure
that contains a value and a pointer to the pv and write an AlphaBeta
function that returns this structure.

struct SResult
{
	int m_value;
	CMoveList * m_pv;
	SResult() { m_value = -INT_MAX; m_pv = 0;}
	// Copy constructor.
	SResult( int value, CMoveList * pv) { m_value = value; m_pv = pv;}
	// A little syntactic sugar so that we can treat an SResult like an
integer.
	SResult & SResult::operator = (int i)  {m_value = i; m_pv = 0; return
*this;}
	operator int() { return m_value;}
	SResult & operator - () { m_value = -m_value; return *this; };
}

SResult CPosition::AlphaBeta( short depth, int alpha, int beta, int
positionbest)
{
	SResult best, value;
	CMove * move;
	CMove bestmove;
	CMoveList * pv;

	if (GameOver() || depth < 0)
		best = Evaluate();
	else while (move = NextMove())
	{
		if (best > alpha )
			alpha = best;

		MakeMove(move);
		value = -AlphaBeta( depth - 1, -beta, -alpha, -best);
		UnMakeMove();

		if (value > best)
		{
			bestmove = *move;
			best = value;

			// As written this code does not support a negative depth.  You will
want to use something
			// other than the depth as the index in to this array.
			// current move number - start move number is good.
			assert(depth >= 0);

			// swap the pointers so the our backed up pv does not get over
written deeper in the search.
			pv = pv[depth - 1];
			pv[depth - 1] = pv[depth];
			pv[depth] = swap;

			// The pv at this depth should be the same as our backed up pointer.
			assert(pv[depth] == best.m_pv);

			if ( best >= beta)
				break;
		}
	}

	// Save the principal variation
	// Designed to work with a zero window or full window.  If you are only
using a full window, the condition can be
	// changed to check for best between alpha and beta and the
positionbest parameter can be removed.
	if (!best.m_pv && best < positionbest)
	{
		// Copy the current path
		// We will assume that current path is maintained by MakeMove and
UnMakeMove.
		*best.m_pv = *currentpath;
		best.m_pv->Add(bestmove);
	}
	return best;
}

Brian
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.