
Observation of Write/WriteV failure, but no error raised by OI

I have code that attempts to update a field in a table record, with error handling in the Else part of the WriteV statement. The program does not enter the Else section even though the record is not updated with the new values. This happens during concurrent writes. I expected to see an FS104, FS133, or similar error. No such error is raised; instead the program executes the Then part of the WriteV statement, where I verify manually that the WriteV has failed. I can and will wrap any such code in a Lock/Unlock pair, but does OI's Lock statement handle failures better than Write/WriteV apparently does? Or does my code need something else to guarantee that the Else part of WriteV is entered after a write failure?

This is the relevant part of my code, with some extra error-handling omitted:
    service DeleteValue(value, table, key, field)
        failedCheck = 0
        table.fldUpd = ""

        open table to table.file then
            if RowExists(table, key) then
                readV table.fld from table.file, key, field then
                    locate value in table.fld setting pos then
                        table.fld = Delete(table.fld, 1, pos, 0)
                        Flush
                        GarbageCollect
                        writeV table.fld to table.file, key, field then
                            //
                            // Check that writeV wrote.
                            //
                            Flush
                            GarbageCollect
                            readV table.fldUpd from table.file, key, field then
                                debug
                                locate value in table.fldUpd setting pos then
                                    failedCheck = 1; * Previous write failed.
                                    action = "Write"
                                    gosub ErrorData
                                end
                            end
                        end else
                            action = "Write"; * This section is not entered.
                            gosub ErrorData
                        end
                    end
                end else
                    action = "Read"
                    gosub ErrorData
                end
            end
        end else
            action = "Open"
            gosub ErrorData
        end
    end service /* DeleteValue */

Any help is appreciated.

Comments

  • edited March 2023
    Deleted
  • Have you checked that table.fld is changed to what you expect it to be before the writev?
    I mean, I can't see any reason why it wouldn't be, but was thinking just maybe, the writev is working but you're writing away a different value to what you're expecting to see after the read.
  • Can you confirm that table.fldUpd and table.fld are different and that table.fld has the 'value' removed?
  • Why the multiple Flush and GarbageCollect statements? I don't think it is safe to call these while in the middle of a process. This could very well be a red herring, but I would at least comment those lines out and confirm the problem still exists.
  • Mark and Barry, I wondered about these points too, and I had to rule out such possibilities. Below I have added debugger screenshots showing table.fld before the deletion and the write and table.fldUpd after the read (which is the write-check).

    Don, Hi, how're you going? I added Flush and GarbageCollect as a last resort after a suggestion by a colleague. I find they have no effect on the WriteV. Flush alone seems to help with the subsequent ReadV. Without it, ReadV frequently shows table.fldUpd with 2 values as well as 3 values during many runs, this despite the fact that reopening the record afterward in the editor shows the same 3 values as before the concurrent access. Without Flush I think caching interferes with what ReadV actually reads. When I step through in debug mode, I never see the false read of two values; it is always 3 values in the debugger, confirmed by reopening the record afterward in the editor.

    Record before concurrent access: [screenshot]

    Field before deletion of value 012345: [screenshot]

    Field before WriteV: [screenshot]

    Field after ReadV (write-check): [screenshot]

    Record reopened after concurrent access: [screenshot]

    Incidentally, this is the ErrorData internal subroutine I used to collect error data:
    ErrorData:
        stamp = TimeDate()
        access = table: "." :key: " <" :field: ">"
        if failedCheck then
            error = action: " failed despite no error raised by OI: "
        end
        error := "Status(): " : Status() : "; @FILE_ERROR: "
        if @FILE_ERROR<1> then error := "FS" : @FILE_ERROR<1>
        if @FILE_ERROR<2> then error := ", " : @FILE_ERROR<2>
        if @FILE_ERROR<3> then error := ", " : @FILE_ERROR<3>
        Response = 0 :@VM: stamp :@VM: action :@VM: access :@VM: error
    return
  • btw: Is this OI9 or OI10?
  • edited March 2023
    I have recreated table and code in OI9 and OI10 and it works for me.
    I just made your code an OI subroutine though.
  • OI 9.4.0

    So are you saying that on a write-failure your program enters the Else part of the WriteV, starting line 26 in the service above?
        end else
            action = "Write"; * This section is not entered.
            gosub ErrorData
        end

  • edited March 2023
    No, I do not get a write failure, the record is updated with the value removed.
  • The write-failure happens for me when two programs concurrently access the record. I have a ten-minute loop which I run separately; while that is running I try to write using the service above. The writeV from the service fails (verified in an editor after the looping script finishes), but the service does not report the error in my error log.
  • Hi Vince, I'm doing fine thanks!

    Have you tested this using Read / Write rather than ReadV / WriteV? Also, what if you use Read / WriteV?

    I have not used ReadV / WriteV extensively. However, I know they are called by RTP7 and RTP8 respectively and then they go through the READ.RECORD and WRITE.RECORD file system primitives the same way that the Read and Write statements do.

    It would not surprise me if some kind of caching is involved. WriteV must be able to protect the other fields from being updated so maybe the original record is cached and then retrieved before the row is written using RTP8.
  • I used Read and Write first up, but I ran it again today to get screenshots. Same result as yesterday with ReadV and WriteV (using a separate program in a ten-minute write-loop on the record to force a write failure from this service subroutine). Likewise with Read/WriteV.

    Now that I have a pointer to the primitives, I see from the prog. ref. manual that although READ.RECORD indicates the result of the Read in a STATUS argument, WRITE.RECORD does not:
    READ.RECORD
    Purpose Used to read a record directly from a file.

    Argument: STATUS
    In: unassigned
    Out: true if record was read successfully

    WRITE.RECORD
    Purpose Used to write a record to a file.

    Argument: STATUS
    In: unassigned
    Out: unchanged


    Record before concurrent access: [screenshot]

    Before deletion of value "012345": [screenshot]

    Before Write: [screenshot]

    After Read (write-check): [screenshot]

    Record reopened after concurrent access: [screenshot]

  • Have you tested this on another table?
  • I've done that now - separate table, record, field, value, using Read/Write. Same result again: The write fails as expected when another program is writing to the record, but no report of the failure. Incidentally, when I delete the value manually in the SRP Editor while a program is writing to the record, the editor reports that the record has been saved, but reopening the record shows that it was not saved. Does the editor also use WRITE.RECORD?
  • I should have thought to ask this at the beginning, you do have the Universal Driver properly installed and configured don't you? You have ServerOnly=1 in the REVPARAM file?
  • Yes, I'm running UD 4.7.2.0. My Revparam has:
    ServerOnly=1
    ServerName=vince_opto
    TcpIpPort=777
  • edited April 2023
    Does the editor also use WRITE.RECORD?

    I think a little explanation about the primitives is in order.

    Nothing uses these primitives in the way I think you are framing the question. Think of the primitives as event handlers for database transactions.

    The real question would be, what does the SRP Editor use to write/read/delete a record? The answer is the Write statements, Read statements, and Delete statements (respectively).

    Each of these statements is the normal way to initiate a standard transaction with a table. This triggers what is called the File System stack. The File System stack comprises zero to n Modifying File System (MFS) routines followed by the Base File System (BFS) routine. There must always be a BFS, as this is the lowest-level procedure that communicates directly with the server. The BFS is typically supplied by Revelation Software. RTP57 is the name of the BFS for the Linear Hash database.

    An MFS is a hook that we can add to a table that intercepts the transaction before it gets to the BFS. Some MFS routines are provided by Revelation, such as SI.MFS, which will then react to a completed transaction and update a secondary table as needed. Developers can provide their own MFS routines to do whatever is required.

    Regardless of whether this is an MFS or the BFS, all transactions get passed through the appropriate primitive. WRITE.RECORD is the primitive associated with any process that writes the record to the table. This includes the Basic+ Write statement, the Write_Row routine, or anything else.

    Getting back to the problem at hand, if I read your results correctly, WriteV is a bit of a red herring. Is that correct? That is, you are able to duplicate the problem using just Read and Write statements? If so, then that simplifies the issue but it also moves the goal post.
  • Yes, the same problem occurs when I use the Write statement.

    The reason I used WriteV was an assumption of efficiency: Write one field rather than the entire record. But now, if I understand what the WRITE.RECORD primitive does (it appears to write the whole record), using WriteV over Write perhaps makes no difference on that count.
  • I'm 99% certain it offers no performance benefits. It is really to isolate and protect the field being operated on. I think relational indexes use WriteV for this very purpose.

    Well, I have no other ideas to offer. This behavior is unexpected. Honestly, if it was a normal behavior this would have been shouted from the rooftops long before now.
  • Don, after your helpful explanation of how reads and writes are handled in OI, I stumbled on RTI_WRITERELEASE(..) which apparently does return FS error numbers. As I was replacing my Write statements with calls to this function, a random thought occurred to me: In my ten-minute loop, what am I writing back to the table? Stupidly, I was writing back an old copy of the record over and over again. I shifted the Read down a couple of lines and tried out the Write statement in my test program while the write-loop was running, and now I get my updated record (same with a manual save in the SRP Editor). So the Write statement was successfully writing to the record the whole time, but it was being immediately overwritten by the looping program!

    I was remarking to a colleague only yesterday about how one can fail to see the forest for all the trees in the way, and here I've done it one more time. My apologies to everyone involved in this discussion for the time spent.

    What is reassuring, now that I have the test right, is that for one minute the looping program on my machine reads and writes the record over 450,000 times, and my test program successfully updated the record for the several times I ran the test this morning.
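
    The mistake described above can be sketched roughly as follows. This is an illustrative reconstruction, not the actual test program: the subroutine name WriteLoopDemo, the table MY_TABLE, the key MY_KEY, and the pass count are all placeholders. With useFreshRead false, the record is read once and the same stale copy is written on every pass, so any update made by another program is overwritten almost immediately. With useFreshRead true, the Read sits inside the loop, which is the "shifted the Read down" fix.

    ```basic
    subroutine WriteLoopDemo(useFreshRead)
        // Hypothetical names: MY_TABLE, MY_KEY, and passes are placeholders.
        table    = "MY_TABLE"
        key      = "MY_KEY"
        passes   = 100000
        failures = 0

        open table to hTable then
            if useFreshRead then
                // Corrected loop: re-read the row on every pass, so each
                // write is based on the current contents of the row.
                for i = 1 to passes
                    read rec from hTable, key then
                        write rec to hTable, key else
                            failures = failures + 1
                        end
                    end
                next i
            end else
                // Stale loop: the row is read once, so every pass writes
                // back the same old copy and undoes other programs' updates.
                read rec from hTable, key then
                    for i = 1 to passes
                        write rec to hTable, key else
                            failures = failures + 1
                        end
                    next i
                end
            end
        end
    return
    ```

    Note that neither loop ever sees a write failure: each Write succeeds, which is why no FS error was raised. Even the fresh-read loop leaves a small window between its Read and its Write in which another program's update can be lost.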
  • I guess this is a bittersweet conclusion. Regardless, I'm glad you solved your problem and you also, ironically, proved that OI works as expected.
  • It's a relief, frankly. And it does open my eyes to the need for systematic locking, which I don't detect in many places in our code (apart from the forms), because even after a record is updated successfully, a program on another station can still overwrite it with stale data.
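
    A minimal sketch of the lock, read, modify, write, unlock pattern, assuming every program that writes to the row follows the same convention. As far as I know, OI locks are cooperative: a program that skips the Lock statement can still write to a locked row, so the pattern only protects against writers that also lock. The function name DeleteValueLocked and the variables hTable, fieldPos, and fieldValues are hypothetical.

    ```basic
    function DeleteValueLocked(value, table, key, fieldPos)
        // Returns 1 if the value was removed and the row written while
        // holding the lock, 0 otherwise. Error reporting is omitted.
        updated = 0

        open table to hTable then
            lock hTable, key then
                // Read only after the lock is held, so the copy we
                // modify cannot already be stale.
                read rec from hTable, key then
                    fieldValues = rec<fieldPos>
                    locate value in fieldValues using @VM setting pos then
                        rec = Delete(rec, fieldPos, pos, 0)
                        write rec to hTable, key then
                            updated = 1
                        end
                    end
                end
                unlock hTable, key else null
            end
        end
    return updated
    ```

    If the Lock fails because another station holds the row, the function simply returns 0 here; a real routine would probably retry or report the failure.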
  • Your next level feat will be mastering transaction processing.
  • That is very handy. I think I already have a use for that.