Update_index function sometimes never returns

josh · June 2021

We have a continuously running process that updates our indexes throughout the day. Sometimes a call to update_index never returns. Is this something anyone has encountered before?

Also, it seems to get stuck on the same tables, so that seems important.

DonBakke · June 2021

I know that corrupted indexes can cause this to happen.

josh · June 2021

I think that is true. We don't really validate length as far as I can tell, so we are vulnerable to that length thing you mentioned in another thread.

Do you know if anything bad can happen if you kill OI while it's in update_index?

josh · June 2021

Also, do you know what happens if two processes call update_index concurrently? I found an old thread on the rev forums that vaguely mentioned that update_index itself locks something and refuses to proceed if it can't lock this thing, but I couldn't make much sense of the post.

DonBakke · June 2021

When I mentioned "corrupted index", I wasn't exactly referring to an index that might be poorly created due to issues such as column length. While that could be considered a "corrupted index", insofar as the index is incomplete or inaccurate, what I really meant was an index that has internal problems that prevent the index from being flushed properly (a.k.a. updated) or rebuilt. In the latter case, if the internal structure of the index table is corrupted, the index logic will hang because it is unable to navigate through and complete its task. We recently had this exact situation occur with a client and we identified the problem through a very careful investigation of the index table records.

If you care to investigate this situation on your own, then I recommend you first read this blog article:

Indexing in OpenInsight Part 2 - How index transactions get created

This will provide you a foundation that can help you identify the source of the corruption on your own.

Any process that updates an index, including the Update_Index routine, places a lock on the index table so only one process will ever be updating the index at the same time. So, if one Update_Index routine has hung with a lock, a second Update_Index routine might also get hung waiting for the lock to be released. I don't know the internals of that routine so I don't know if it times out if the lock doesn't become available in a reasonable amount of time. I suspect it just waits indefinitely.
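To picture the waiting behavior described above, here is a minimal sketch in Python. It is not BASIC+ and it does not reflect the actual internals of Update_Index. It models the index-table lock as an ordinary `threading.Lock` and shows a caller that gives up after a timeout instead of waiting indefinitely. The names `index_lock`, `try_update_index`, and `demo` are hypothetical, and whether the real routine has any timeout at all is unknown.

```python
import threading

# Hypothetical stand-in for the lock a process places on the index table.
index_lock = threading.Lock()


def try_update_index(timeout_seconds):
    """Try to take the index lock.

    Returns True if the lock was obtained (and the update could run),
    False if the lock was still held when the timeout expired.
    """
    acquired = index_lock.acquire(timeout=timeout_seconds)
    if not acquired:
        return False
    try:
        # Placeholder: the real index flush would happen here.
        return True
    finally:
        index_lock.release()


def demo():
    """Simulate one hung updater holding the lock while another one tries."""
    index_lock.acquire()  # the "hung" updater never lets go
    try:
        blocked = try_update_index(0.1)  # second caller times out
    finally:
        index_lock.release()  # e.g. the hung process was finally killed
    free = try_update_index(0.1)  # lock is available again
    return blocked, free
```

Without the timeout argument, the second caller would simply block until the first one released the lock, which matches the "waits indefinitely" suspicion.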

DonBakke · June 2021

If you kill OI while it's in Update_Index, it will probably create more corruptions. However, since you probably already have a corrupted index this is not likely to be a problem. Chances are you'll be recreating these indexes anyway.

josh · June 2021

the table has 25 million rows, so...and the entire business will not be able to function without this table lol

DonBakke · June 2021

Not to make light of the situation, but this happens and it is a royal PITA. My suggestion is to take a copy of the table (and its index) offline and go through the index rebuild process (assuming, of course, you've confirmed the indexes are corrupted). Then create a journal on your active table, such as in an MFS, to track which records are added/modified/deleted in the interim. When your indexes are restored, you can replace the live index files with the ones you just updated. You'll likely need a short maintenance window to do this to be safe. Then create a script that simulates the activity on those records that you tracked in the journal. This will get the indexes up to speed.
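The journal-and-replay idea can be sketched roughly as follows. This is a toy model in Python, not OpenInsight code: the index is a plain dictionary mapping an indexed value to the set of record keys holding it, and the journal is a list of `(op, key, old_value, new_value)` tuples. Those structures and the names `build_index` and `replay_journal` are assumptions made purely for illustration.

```python
from collections import defaultdict


def build_index(table):
    """Rebuild a value -> {keys} index from an offline copy of the table.

    `table` is a dict of {record_key: indexed_value}.
    """
    index = defaultdict(set)
    for key, value in table.items():
        index[value].add(key)
    return index


def replay_journal(index, journal):
    """Apply journaled changes, in order, to the rebuilt index.

    Each entry is (op, key, old_value, new_value) with op in
    "add", "modify", or "delete".
    """
    for op, key, old_value, new_value in journal:
        if op in ("modify", "delete"):
            # Remove the key from the value it used to be indexed under.
            index[old_value].discard(key)
            if not index[old_value]:
                del index[old_value]
        if op in ("add", "modify"):
            index[new_value].add(key)
    return index


# Offline copy taken before the rebuild started.
snapshot = {"1": "SMITH", "2": "JONES"}
rebuilt = build_index(snapshot)

# Activity captured on the live table while the rebuild was running.
journal = [
    ("add", "3", None, "SMITH"),
    ("modify", "2", "JONES", "BROWN"),
    ("delete", "1", "SMITH", None),
]
caught_up = dict(replay_journal(rebuilt, journal))
```

The ordering of journal entries matters: replaying them out of sequence could leave a key indexed under a stale value.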

BTW, if your table is that big, then it is likely your index table is heavily unbalanced (i.e., the OV file is significantly larger than the LK file). This would be a good opportunity to balance the files.

josh · June 2021

Thank you for your advice and the article. I have always wanted to know how the indexing works in rev. I will pass this info on to my team.

josh · June 2021

btw, do you know if update_index has yield calls inside of it? I noticed that I am able to move the window around while update_index is processing.

DonBakke · June 2021

I can't be 100% certain, but an inspection of the object code does not show any references to Yield(). I believe Update_Index is really just a shell that calls other lower-level SSPs so perhaps one or more of those calls Yield(). You could create a log file to confirm this.
