Observation of Write/WriteV failure, but no error raised by OI

vince · March 2023

I have code that attempts to update a field in a table record, with error handling in the Else part of the WriteV statement. The program does not enter the Else section even though the record is not updated with the new values. This happens during concurrent writes. I expected to see an FS104, FS133, or similar error. No such error is raised; instead the program executes the Then part of the WriteV statement, where I verify manually that the WriteV has failed. I can and will wrap any such code in a Lock/Unlock pair, but does OI's Lock statement handle failures better than Write/WriteV apparently does? Or does my code need something else to guarantee that the Else part of WriteV will be entered after a write failure?

This is the relevant part of my code, with some extra error-handling omitted:

service DeleteValue(value, table, key, field)
	failedCheck  = 0
	table.fldUpd = ""
	open table to table.file then
		if RowExists(table, key) then
			readV table.fld from table.file, key, field then
				locate value in table.fld setting pos then
					table.fld = Delete(table.fld, 1, pos, 0)
					Flush
					GarbageCollect
					writeV table.fld to table.file, key, field then
						//
						// Check that writeV wrote.
						//
						Flush
						GarbageCollect
						readV table.fldUpd from table.file, key, field then
							debug
							locate value in table.fldUpd setting pos then
								failedCheck = 1;                                                * Previous write failed.
								action = "Write"
								gosub ErrorData
							end
						end
					end else 
						action = "Write";                                                 * This section is not entered.
						gosub ErrorData
					end
				end
			end else
				action = "Read"
				gosub ErrorData
			end
		end
	end else
		action = "Open"
		gosub ErrorData
	end
end service /* DeleteValue */

Any help is appreciated.

BarryStevens · March 2023

Deleted

AusMarkB · March 2023

Have you checked that table.fld is changed to what you expect it to be before the writev?
I mean, I can't see any reason why it wouldn't be, but was thinking just maybe, the writev is working but you're writing away a different value to what you're expecting to see after the read.

BarryStevens · March 2023

Can you confirm that table.fldUpd and table.fld are different, and that table.fld has the 'value' removed?

DonBakke · March 2023

Why the multiple Flush and GarbageCollect statements? I don't think it is safe to call these while in the middle of a process. This could very well be a red herring, but I would at least comment those lines out and confirm the problem still exists.

vince · March 2023

Mark and Barry, I wondered about these points too, and I had to rule out such possibilities. Below I have added debugger screenshots showing table.fld before the deletion and the write and table.fldUpd after the read (which is the write-check).

Don, Hi, how're you going? I added Flush and GarbageCollect as a last resort after a suggestion by a colleague. I find they have no effect on the WriteV. Flush alone seems to help with the subsequent ReadV. Without it, ReadV frequently shows table.fldUpd with 2 values as well as 3 values during many runs, this despite the fact that reopening the record afterward in the editor shows the same 3 values as before the concurrent access. Without Flush I think caching interferes with what ReadV actually reads. When I step through in debug mode, I never see the false read of two values; it is always 3 values in the debugger, confirmed by reopening the record afterward in the editor.

[Screenshots not preserved: record before concurrent access; field before deletion of value 012345; field before WriteV; field after ReadV (write-check); record reopened after concurrent access.]

Incidentally, this is the ErrorData internal subroutine I used to collect error data:

ErrorData:
	stamp  = TimeDate()
	access = table: "." :key: " <" :field: ">"
	error  = "";                                                      * Initialise so ":=" below never appends to an unassigned variable.
	if failedCheck then
		error = action: " failed despite no error raised by OI: "
	end
	error := "Status(): " : Status() : "; @FILE_ERROR: "
	if @FILE_ERROR<1> then error := "FS" : @FILE_ERROR<1>
	if @FILE_ERROR<2> then error := ", " : @FILE_ERROR<2>
	if @FILE_ERROR<3> then error := ", " : @FILE_ERROR<3>
	Response = 0 :@VM: stamp :@VM: action :@VM: access :@VM: error
return

BarryStevens · March 2023

btw: Is this OI9 or OI10?

BarryStevens · March 2023

I have recreated the table and code in OI9 and OI10 and it works for me.
The only change is that I made your code an OI subroutine.

vince · March 2023

OI 9.4.0

So are you saying that on a write-failure your program enters the Else part of the WriteV, starting line 26 in the service above?

                    end else
                        action = "Write"; * This section is not entered.
                        gosub ErrorData
                    end

BarryStevens · March 2023

No, I do not get a write failure, the record is updated with the value removed.

vince · March 2023

The write-failure happens for me when two programs concurrently access the record. I have a ten-minute loop which I run separately; while that is running I try to write using the service above. The writeV from the service fails (verified in an editor after the looping script finishes), but the service does not report the error in my error log.

DonBakke · March 2023

Hi Vince, I'm doing fine thanks!

Have you tested this using Read / Write rather than ReadV / WriteV? Also, what if you use Read / WriteV?

I have not used ReadV / WriteV extensively. However, I know they are called by RTP7 and RTP8 respectively and then they go through the READ.RECORD and WRITE.RECORD file system primitives the same way that the Read and Write statements do.

It would not surprise me if some kind of caching is involved. WriteV must be able to protect the other fields from being updated so maybe the original record is cached and then retrieved before the row is written using RTP8.

vince · March 2023

I used Read and Write first up, but I ran it again today to get screenshots. Same result as yesterday with ReadV and WriteV (using a separate program in a ten-minute write-loop on the record to force a write failure from this service subroutine). Likewise with Read/WriteV.

Now that I have a pointer to the primitives, I see from the prog. ref. manual that although READ.RECORD indicates the result of the Read in a STATUS argument, WRITE.RECORD does not:

READ.RECORD
Purpose: Used to read a record directly from a file.

Argument: STATUS
In: unassigned
Out: true if record was read successfully

WRITE.RECORD
Purpose: Used to write a record to a file.

Argument: STATUS
In: unassigned
Out: unchanged

[Screenshots not preserved: record before concurrent access; before deletion of value "012345"; before Write; after Read (write-check); record reopened after concurrent access.]

DonBakke · March 2023

Have you tested this on another table?

vince · March 2023

I've done that now - separate table, record, field, value, using Read/Write. Same result again: The write fails as expected when another program is writing to the record, but no report of the failure. Incidentally, when I delete the value manually in the SRP Editor while a program is writing to the record, the editor reports that the record has been saved, but reopening the record shows that it was not saved. Does the editor also use WRITE.RECORD?

DonBakke · March 2023

I should have thought to ask this at the beginning: you do have the Universal Driver properly installed and configured, don't you? Do you have ServerOnly=1 in the REVPARAM file?

vince · April 2023

Yes, I'm running UD 4.7.2.0. My Revparam has:
ServerOnly=1
ServerName=vince_opto
TcpIpPort=777

DonBakke · April 2023

"Does the editor also use WRITE.RECORD?"

I think a little explanation about the primitives is in order.

Nothing uses these primitives in the way I think you are framing the question. Think of the primitives as event handlers for database transactions.

The real question would be, what does the SRP Editor use to write/read/delete a record? The answer is the Write statements, Read statements, and Delete statements (respectively).

Each of these statements is the normal way to initiate a standard transaction with a table. This triggers what is called the File System stack. The File System stack consists of zero to n Modifying File System (MFS) routines followed by the Base File System (BFS) routine. There must always be a BFS, as this is the lowest-level procedure and the one that communicates directly with the server. The BFS is typically supplied by Revelation Software. RTP57 is the name of the BFS for the Linear Hash database.

An MFS is a hook that we can add to a table that intercepts the transaction before it gets to the BFS. Some MFS routines are provided by Revelation, such as SI.MFS, which will then react to a completed transaction and update a secondary table as needed. Developers can provide their own MFS routines to do whatever is required.

Regardless of whether it is an MFS or the BFS, every transaction gets passed through the appropriate primitive. WRITE.RECORD is the primitive associated with any process that writes a record to the table. This includes the Basic+ Write statement, the Write_Row routine, or anything else.

Getting back to the problem at hand, if I read your results correctly, WriteV is a bit of a red herring. Is that correct? That is, you are able to duplicate the problem using just Read and Write statements? If so, then that simplifies the issue but it also moves the goal post.

vince · April 2023

Yes, the same problem occurs when I use the Write statement.

The reason I used WriteV was an assumption of efficiency: Write one field rather than the entire record. But now, if I understand what the WRITE.RECORD primitive does (it appears to write the whole record), using WriteV over Write perhaps makes no difference on that count.

DonBakke · April 2023

I'm 99% certain it offers no performance benefits. It is really to isolate and protect the field being operated on. I think relational indexes use WriteV for this very purpose.

Well, I have no other ideas to offer. This behavior is unexpected. Honestly, if it was a normal behavior this would have been shouted from the rooftops long before now.

vince · April 2023

Don, after your helpful explanation of how reads and writes are handled in OI, I stumbled on RTI_WRITERELEASE(..) which apparently does return FS error numbers. As I was replacing my Write statements with calls to this function, a random thought occurred to me: In my ten-minute loop, what am I writing back to the table? Stupidly, I was writing back an old copy of the record over and over again. I shifted the Read down a couple of lines and tried out the Write statement in my test program while the write-loop was running, and now I get my updated record (same with a manual save in the SRP Editor). So the Write statement was successfully writing to the record the whole time, but it was being immediately overwritten by the looping program!
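For anyone skimming the thread, the mistake described above can be sketched roughly as follows. This is a hypothetical reconstruction, not the actual test program: the table name TEST_TABLE, the key KEY1, and the iteration count are all assumptions made for illustration.

```
// Hypothetical reconstruction of the write-loop. TEST_TABLE, "KEY1" and the
// iteration count are made-up values for illustration only.
key = "KEY1"
open "TEST_TABLE" to hTable then

	// Buggy shape: the record is read once, before the loop, so every pass
	// writes the same stale copy and undoes updates made by other programs.
	read rec from hTable, key then
		for i = 1 to 100000
			write rec to hTable, key else
				null;                                                     * Error handling omitted.
			end
		next i
	end

	// Adjusted shape: the Read sits inside the loop, so each pass writes back
	// whatever is currently in the table, including other programs' changes.
	for i = 1 to 100000
		read rec from hTable, key then
			write rec to hTable, key else
				null;                                                     * Error handling omitted.
			end
		end
	next i

end
```

In the first shape every Write succeeds, which is why no FS error is ever raised; the other program's update is simply replaced a moment later. The second shape still leaves a small window between the Read and the Write, so it narrows the problem rather than removing it.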

I was remarking to a colleague only yesterday about how one can fail to see the forest for all the trees in the way, and here I've done it one more time. My apologies to everyone involved in this discussion for the time spent.

What is reassuring, now that I have the test right, is that in one minute the looping program on my machine reads and writes the record over 450,000 times, and my test program successfully updated the record each of the several times I ran the test this morning.

DonBakke · April 2023

I guess this is a bittersweet conclusion. Regardless, I'm glad you solved your problem and you also, ironically, proved that OI works as expected.

vince · April 2023

It's a relief, frankly. And it does open my eyes to the need for systematic locking, which I don't see in many places in our code (apart from the forms): even after a record has been updated successfully, a program on another station can still overwrite it with stale data.
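A minimal sketch of what that locking might look like around the read-modify-write in the DeleteValue service at the top of the thread. It reuses the variable names and the ErrorData subroutine from that service; the "Lock" and "Unlock" action labels are additions for illustration, and whether to retry or give up when the lock is unavailable is left open.

```
// Sketch only: table.file, key, field and value are as in DeleteValue above.
lock table.file, key then
	readV table.fld from table.file, key, field then
		locate value in table.fld setting pos then
			table.fld = Delete(table.fld, 1, pos, 0)
			writeV table.fld to table.file, key, field else
				action = "Write"
				gosub ErrorData
			end
		end
	end else
		action = "Read"
		gosub ErrorData
	end
	// Always release the lock once the read-modify-write is finished.
	unlock table.file, key else
		action = "Unlock"
		gosub ErrorData
	end
end else
	// Another process holds the lock: report (or retry) rather than writing.
	action = "Lock"
	gosub ErrorData
end
```

OI locks are cooperative, so this only protects the record if every program that writes to it (including something like the ten-minute loop) takes the same lock before reading. A program that writes without locking can still overwrite the record with stale data.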

DonBakke · April 2023

Your next level feat will be mastering transaction processing.

vince · April 2023

That is very handy. I think I already have a use for that.
