Work, Work, Work - Oracle and Visual Studio .Net (2005)
March 1, 2008
By Scott Lewis
Introductory paragraph goes here.
Things the last few weeks have been crazy busy. I normally don't talk about work on this web site, but I am going to do that now. I plan to be vague enough that if you don't know me personally you won't be able to know where I work or exactly what my company produces.
In early February we installed all new code that I wrote for out print production. We are migrating off an HP 3000 mainframe. The code base I inherited when I started at this company over 6 years ago was all in Visual Basic 6 pointing to an Oracle 8i database.
Some time after getting Oracle 10g up and running in production (for some features) we copied the schema from 8i to 10g (by way of a virtual machine running Oracle 9i, you can't directly export/import from 8i to 10g).
Well, I have been rewriting all the applications in .Net (Visual Basic 2005). The old system was all in-line SQL. Now we do it the right way and use all stored procedures for all data access. You can't read or write to the database from the user IDs we setup. You can only run stored procedures. Cool.
On the weekend we installed my software we install all the software responsible for a major portion of out print production. This software account for about 75% of my companies revenue and profits. So this was a very high profile rollout.
Part of the rollout was a change in the way work is handled. In the old system we allowed printer operators to batch up work and submit it to be composed into a Postscript job. Once the Postscript is composes they can send the work to a printer.
The new software includes going to a new "auto batching" routine. Instead of an application that they can select work from to batch I wrote a Windows Service that checks a time table and command table every minute. If the command table contains a command it is executed. The command table is populated by a web application that allows them to select products to print. This was done as a backup kind of application, as we should not need to manually batch work anymore.
The part of auto-batching that checks the time table looks to see if it is time to batch. It batches every hour. When it batches it just batches everything in the system that will fill the product without wasting stock (called voids, or empty cells). Twice a day we batch "old" orders. Old orders are 3 business days old. If a product does not have enough orders to prevent voids then it will be batched when one of the orders we do have is three days old. We take the hit for voids, and wasted material to guarantee three day processing time for out products.
I have been chasing bugs since this went into production. The largest part of the problem has been product setup and templates. These were not my responsibility. One of my co-workers built a web application to perform product setup, and it is th marketing and manufacturing people that are responsible to setup the products in the new system and get all the templates setup correctly.
We made a couple of fundamental change to some of the templates that caused a lot of work to get them right. The changes make new product setup easier and more flexible, but we have a few hundred product that each had to be looked at to see if they were correct.
As of this writing I am dealing with a brand new issue. We are failing to update a table in Oracle. When we batch the orders together we insert a record to a RUNS table. This table keeps track of where the run is in the system, and which programs should working on the run. WE also insert records into a CELLS table to keep track of which orders are on the run. When we insert the RUNS record we set the status in that record to 300 (batched).
The batching program reads the RUNS and CELLS tables to build an XML file used by another program. If it is successful and also submits the run to a queue for the next step it updates the RUNS record to status 301. If the batching gets an error for any reason it updates the RUNS record to status 100 (error).
The next program is called an XML Processor. It takes the XML file and builds the input files for our Postscript composer. If the XML Processor finishes successfully and puts the run in the queue to be composed then it updates the RUNS record to 302 (composing).
When the composer finishes generating the Postscript output it runs a program to update the RUNS record to status 304 (ready to print). Ready to print is actually ready to send to a printer, as we have 6 in use right now. The one program the users get is an application that lists everything in status 304 in the RUNS table and displays it to them in a grid so they can highlight the runs and send them to the printers.
The bug I have been having trouble with is the failing to update the RUNS record. For some reason some runs never go to 301, 302 or 304. It is really weird. They stay in status 300. The weird part is that once a runs gets stuck at 300 it never goes further. Three different program each update the status without checking the status. They blindly update the status field through the same stored procedure at their respective time. None of the programs have any trouble updating the status of runs that went to status 301 correctly in the first place. About 20% of the runs are getting stuck in status 300 even thought the run made it to a correctly composed Postscript document.
I added a check for "rows updated" by the stored procedure and logged that in two of the three apps. I have not added the logging of the rows updated to the Windows Service because that takes too long to build and install the update. The other two programs were easy so I put the logging in and they report they updated 1 row... yet they never updated the row. That's weird. I am really stumped.
I may not get this problem solved by the time I publish this article. My next step is going to be to add an update trigger to the RUNS table. I am going to write to a new table the run number, status and date/time on the update trigger. This will allow me to determine if the stored procedure is really updating the table or not. I think this may be an Oracle issue, so I will probably get our DBA to help me.
Sorry I only had one topic this month. It has been very busy and it still is. I hope to spend more leisurely time with my computer next month.