Author: Viet Tran

The performance effect of the Download button in Maximo

Introduction

If you have been working with Maximo for a while, you already know about the Download button at the top left of every table in Maximo. With one click, it exports everything displayed in the table into an Excel file, which is great for further data analysis and reporting. It is so simple and convenient, right? Not quite.

Performance Degradation

The danger with the Download button is that, since it is so convenient, everyone uses it for every data requirement. One frequent problem is that users keep asking us to add more columns to the List tab of key applications like Work Order Tracking. Most of the time, the Maximo Administrator will comply with such requests in a blink (another problem of Maximo being too easy to customize). Often, many of those columns are retrieved via relationships. One additional column usually does not make much difference. But when there is a high number of records, and as the amount of user activity increases, it creates a snowball effect that degrades overall system performance, and people start complaining about Maximo being slow.

However, the real problem with the Download button is that, by default, there is no limit set on it. Usually, when Maximo is first implemented, it works great. After several years, the amount of data grows, people start using this method to download data for various reporting requirements, and the Download button begins to significantly affect system performance.

Measure the impact

For a client that I recently worked with, many users frequently used the Download button to extract large amounts of data (e.g. all work orders in one year) to create their own reports in Excel. This put a tremendous amount of stress on the servers.

First, we must realize that the output of this “Download” function is actually an XML file (although the extension is XLS), and Maximo consumes a lot of processing power and memory to generate it. To fully understand how it affects the server, I did a small test by setting up a copy of the client’s database on a local VM. I opened the Work Order Tracking app and tried to download all active work orders (15k records). That took around 15 minutes to generate and download the file. Then I tried to download all work orders reported in the last year (both closed and active work orders). It took 50 minutes, and during this whole time, the VM’s CPU and memory utilization was saturated.

CPU and Memory usage while Maximo is processing the download request
CPU and Memory usage after the processing completed

Crashing the server

To test the worst-case scenario, I opened the SR application, clicked on “All Records”, then clicked on the Download button to download all 600 thousand records. Users can easily make this “mistake”, and once they click the Download button, there is no option for them to cancel the process.

At first, the process saturated CPU and memory utilization for more than an hour; after that, the session expired. However, in the background, the SQL Server process continued running, consuming 30-40% CPU. I left it to run for about 6-7 hours until I got fed up and had to restart SQL Server to kill that process. Theoretically, since I was the only user in the system, when the process was running and reached the maximum JVM heap size, the Garbage Collector would try to clean up the previously used MBOs and free up some memory.

However, in a production environment, as happened to our client, if the server load is high, the Garbage Collector sometimes can’t free up memory quickly enough, resulting in an OutOfMemory error that crashes the server.

What can we do about this problem?

To reduce the impact of the Download function, we should set a limit on the number of records a user can download by setting a value for the webclient.maxdownloadrows system property. There are already some IBM tech notes that talk about it.
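For example (the value here is illustrative, not a recommendation): setting webclient.maxdownloadrows to 200 in the System Properties application caps every table download at 200 rows; anything larger then has to go through one of the alternatives discussed next.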

However, the next question is: once we have set a limit, what is the alternative method for users to download the data they need? I can think of a few, like building a simple BIRT report, which allows users to choose XLS as the output format. We can also set up the Application Export function with a flat-file output format. But my favourite option is the “Create Report” function. By default, when we create and run a report in “Preview” mode, it exports exactly the same columns as the List tab of the application, and we can then “Export Data” from that report. The process takes a few clicks, but processing time is usually less than one minute instead of 10 or 20 minutes. That’s a quick win. Also, once users get used to it, they can extract any data they like. That means less work for the Maximo Administrator.

The curious case of the MIA work orders


Working in IT, we deal with strange issues all the time. However, every once in a while, something would come up that leaves us scratching our heads for days. One such issue happened to us a few years back. It came back to me recently and this time, I thought to myself I should note it down.

Summary

  • Maximo – TechnologyOne integration error: work orders went missing.
  • There are no traces of the problem; everything appears to be working fine.
  • The problem is that the F5 Load Balancer returns a maintenance page with an HTTP 200 status code, which leads Maximo to think the outbound message was received successfully by WebMethods.

The mysterious missing work orders

The issue was first reported to us when a user raised a ticket about missing work orders in TechnologyOne, the finance management system used by our client. Without work orders created in TechOne, users can’t report actual labour time or other costs, so this is considered a high-priority issue.

F5 maintenance page for integration should not have HTTP 200 OK status

Integration Background

TechOne is integrated with Maximo using WebMethods, an enterprise integration platform. Unlike with direct integration, these types of problems are usually easy to deal with when an enterprise integration tool is used: we simply look at the transaction log, identify the failed transactions and what caused them, fix the issue, and then resubmit the message. All good integration tools have such fundamental capabilities.

In this case, we looked at WebMethods’ transaction history and couldn’t find any trace of the missing work orders. We also spent quite some time digging through the log files of each server in the cluster but couldn’t find anything relevant. That was to be expected: if there had been an error, it should have been picked up, and the system should have raised alarms and email notifications through the several overlapping monitoring channels we had set up for this client.

Clueless

On the other hand, when we looked at Maximo’s Message Tracking and log files, everything looked normal, with work orders published to WebMethods correctly and without interruption. In other words, Maximo said it had sent the message, while WebMethods said it never received anything. This left us in limbo for a few days. And of course, when we had no clue, we did what we application people do best: we blamed the network guys.

The network team couldn’t find anything strange in their logs either. So we let the issue slip for a few days without any real progress. During this time, users kept reporting new missing work orders, not knowing that I wasn’t really doing any troubleshooting work; I was staring at the screen mindlessly all day long.

Light at the end of the tunnel

Then, of course, when you stare at something long enough, the problem reveals itself. With enough work orders reported, it became clear that updates only went missing between 9 and 11 PM, regardless of the type of work order or the data entered. When this pattern was mentioned, it didn’t take long for someone to point out that this is usually the time when IT does its Windows patching.

When a server is being updated, IT sets the F5 Load Balancer to redirect any user request to a “Site Under Maintenance” page, which makes sense for a normal user accessing the service via a browser. The problem is that when Maximo published an integration message to WebMethods, it received the same web page. That by itself is fine, since Maximo doesn’t process the response content. However, the status of the response was HTTP 200, which is not fine in this case: since it’s an HTTP 200 OK status, Maximo thought the message had been accepted by WebMethods and marked it as a successful delivery. WebMethods, on the other hand, never received the message.

Lesson Learned

The recommendation in this case is to set the status of the maintenance page to something other than HTTP 2xx (e.g. HTTP 503 Service Unavailable). When Maximo receives a status other than 2xx, it marks the message as a delivery failure. This means the administrator will be notified if monitoring is set up, and the failed message will be listed as an error and can be resubmitted using the Message Reprocessing app.

Due to the complex communication chain involved, I never heard back from the F5 team on what exactly was done to rectify the issue. However, from a quick search, it looks like it can be achieved easily by updating the rule in F5.

This same issue recently came back to me, so I noted it down in my list of common issues with load balancers. I think it is also fun enough to deserve a separate post. This is a lengthy story; if you made it this far, I hope it will be useful to you at some point.

How to deploy changes to Maximo without downtime?

Downtime is costly to the business. As developers, avoiding it gives us a ton of benefits, both in terms of efficiency and personal well-being. For example, when making changes to a shared environment that would normally require downtime, I get my freedom back since I don’t have to ask for a window or wait to do it at night.

With the introduction of Automation Script, most of the business logic and front-end changes we need to push to production nowadays can be done without downtime. Some of them are:

  • Automation Script
  • Escalation
  • Application Design
  • Conditions
  • Workflows

However, Database Configuration changes still need Admin Mode or a restart. 

In recent years, many of us have switched to DBC scripts to deploy changes. This approach takes more time to prepare than other methods, such as using Migration Manager or doing it by hand, but it proves to be very reliable and allows faster deployment with much less risk.

Many of us have probably realized that, for small changes, we can run the DBC script directly while the system is live. But after that, we still need a quick restart. It doesn’t matter whether it’s a small environment that takes 5 minutes to restart or a massive cluster that needs 30 minutes: a restart is downtime, and any deployment that involves downtime is treated differently, with days or weeks of planning and rounds of approval and review.

For development, a colleague showed me a trick: instead of a restart, we can just turn Admin Mode on and off. As part of this process, Maximo’s cache is refreshed and the changes take effect. This works quite well in some instances. However, it is still downtime and can’t be used for Production; on a big cluster, turning on Admin Mode often takes more time than a restart.

Another colleague hinted at a different method, and this is what I ended up with. I have been using it for a while now and can report that it is quite useful. Not only has my productivity improved, but it has also proven valuable a few times when I didn’t have to approach cloud vendors to ask for downtime or a restart.

The approach is very simple: when I have a change that requires a restart, I script it using DBC. If the change is small, I can get away with using UPDATE/INSERT SQL statements directly against the configuration tables, such as:

  • MAXATTRIBUTE/MAXATTRIBUTECFG
  • MAXOBJECT/MAXOBJECTCFG
  • SYNONYMDOMAIN
  • MAXLOOKUPMAP
  • Etc.

Next, I will create a super complex automation script named refreshmaxcache (with no launch point), shown below:
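A minimal Jython sketch of such a script (a sketch only: I am assuming reloadMaximoCache() is the MXServer method that reloads the registered server caches, and that responseBody is available as the script REST API’s implicit response variable; verify both against your Maximo version):

from psdi.server import MXServer

# Assumption: reloadMaximoCache() asks MXServer to reload its registered
# caches (data dictionary, maxvars, sequence values, etc.) from the database.
MXServer.getMXServer().reloadMaximoCache()

# Implicit variable of the script REST API: returned to the caller
responseBody = "Maximo cache refreshed"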

That’s it. Every time you deploy a change, all you need to do is call the API script via the following URL to refresh the configuration:

https://[MAXIMO_ROOT]/maximo/oslc/script/refreshmaxcache

Note: this is not a bulletproof approach officially recommended by IBM. As such, if you use it for Production, make sure you understand the change and its impact. I only use it for small changes in areas where there is little or no risk of users writing data while the change is being applied. For a major deployment, for example a change to the WORKORDER table, it’s a bad idea to apply it during business hours. For non-production, I don’t see much risk involved.

A man who doesn’t work at night is a happy person.

How to run SQL queries in Maximo without database access?

With the introduction of the Maximo Application Suite, I have had to deal with more and more Maximo environments in the cloud. This often means there is no access to the backend, such as the database or the WebSphere/OpenShift console. Sometimes, to troubleshoot issues, it is critical to be able to run queries against the database. In this post, I will introduce a new approach to accessing the database using Automation Script.

From Maximo version 7.6.0.9, we can build custom APIs using automation scripts. This is a powerful feature, yet it looks to be underutilized by the community.

The first obvious use case is that it gives us the freedom to build any API we want without being restricted by the limitations of the Maximo Integration Framework. For example, we can create an API that returns data in CSV or binary format, or use it to upload data and bypass the business layer.

Since it allows us to use the browser to interact with Automation Script, and the script framework has access to all Java functions of the MBO layer, we can exploit it to execute all sorts of weird operations. An API script is exposed at a URL of this form:

https://[MAXIMO_ROOT]/maximo/oslc/script/[SCRIPT_NAME]

In an article I posted a few days ago, I used an API script to call a Java function to refresh the Maximo sequence cache and avoid a restart. Using the same approach, we can do database configuration and deployment without downtime.

In this post, I’ll demonstrate how we can use an API script to run SELECT, UPDATE, and DELETE SQL statements against the Maximo database without direct DB access. This can come in handy when DB access is restricted. Of course, we can use MXLoader to achieve the same result, but this method is a lot more convenient.

Creating an API script is very simple: we just create a script without a launch point and call it via the URL format above. If you’re already logged in and have a session, that is all it takes. Otherwise, to authenticate the request, you can pass in username and password parameters as you normally do when calling the REST API:

https://[MAXIMO_ROOT]/maximo/oslc/script/[SCRIPT_NAME]?_lid=[USERNAME]&_lpwd=[PASSWORD]

To run a SELECT query on the database, I created a script named RUNSQL with the code below:
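Below is a minimal Jython sketch of what RUNSQL can look like (assumptions: the implicit request and responseBody variables of the script REST API, and the DBManager/ConnectionKey pattern for borrowing a JDBC connection from Maximo’s pool; treat it as a sketch rather than a drop-in implementation):

from psdi.server import MXServer

mx = MXServer.getMXServer()
method = request.getQueryParam("method")  # SELECT, UPDATE, DELETE...
sql = request.getQueryParam("sql")

# Borrow a JDBC connection from Maximo's connection pool
connKey = mx.getSystemUserInfo().getConnectionKey()
conn = mx.getDBManager().getConnection(connKey)
try:
    stmt = conn.createStatement()
    if method == "SELECT":
        rs = stmt.executeQuery(sql)
        meta = rs.getMetaData()
        cols = range(1, meta.getColumnCount() + 1)
        lines = [",".join([meta.getColumnName(i) for i in cols])]
        while rs.next():
            lines.append(",".join([str(rs.getString(i)) for i in cols]))
        rs.close()
        responseBody = "\n".join(lines)  # returned to the browser as CSV
    else:
        count = stmt.executeUpdate(sql)
        conn.commit()
        responseBody = str(count) + " row(s) affected"
    stmt.close()
finally:
    mx.getDBManager().releaseContext(connKey)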

To use the script to run a query, I typed the SQL query directly in the URL in the sql parameter as below.

https://[MAXIMO_URL]/maximo/oslc/script/runsql?method=SELECT&sql=SELECT top 10 assetnum,description,status FROM asset WHERE status = 'OPERATING'

In this case, the data is returned to the browser in the CSV format.

To execute a query that does not return data (INSERT/UPDATE/DELETE):

https://[MAXIMO_URL]/maximo/oslc/script/runsql?method=DELETE&sql=DELETE maxsession WHERE userid = 'maxadmin'

Note: I have tested this against SQL Server. I haven’t had a chance to test it against DB2 or Oracle databases.

How to reset sequence without restarting Maximo?

One error we often have to deal with is an incorrect sequence when adding new data to Maximo. There are many situations which can cause this issue, such as:

  • Loading data using MXLoader, or inserting data directly via SQL
  • Sequence corruption in Production due to an unknown cause, probably errors from a cancelled/terminated job
  • Restoring the database from a copy, or after an upgrade

When this happens, the user sees an error with a duplicated key value, such as:

BMXAA4211E - Database error number 2601 has occurred…

The solution is well documented and straightforward: we just need to find the current maximum ID value used in the table and update the corresponding sequence to use the next value.

For example, if the error occurs with the WORKORDERID field of the WORKORDER table, we can do this SQL update and restart Maximo.

UPDATE maxsequence SET maxreserved = (SELECT max(workorderid) + 1 FROM workorder) WHERE tbname = 'WORKORDER' and name = 'WORKORDERID'

However, I like to avoid restarting Maximo if possible due to some obvious problems such as:

  • I recently had to do a quick deployment which involved uploading some data. For some unknown reason, loading the data via MXLoader caused random sequence corruption a few times. For this client, which has a large cluster, restarting Maximo would require an additional 30-60 minutes of downtime.
  • A location hierarchy update required me to insert a few thousand new records into the LOCANCESTOR table, and I needed to update the sequence to a new value for subsequent data uploads via MIF to work. Since it is a cloud environment, avoiding a restart means we don’t depend on the availability of the cloud provider.

To address that problem, the simplest solution I found to hot-reset the sequence cache without restarting Maximo is by calling the reset sequence Java function via an automation script. The steps are as follows:

  • Create a new script with no launch point:
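A minimal Jython sketch of the script body (named runtask here to match the URL below; as in the no-downtime deployment post, I am assuming reloadMaximoCache() is the Java method involved, so verify it against your version first):

from psdi.server import MXServer

# Assumption: reloadMaximoCache() reloads MXServer's registered caches,
# including the cached maxsequence values updated by the SQL above.
MXServer.getMXServer().reloadMaximoCache()
responseBody = "Sequence cache reset"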

Whenever we update the maxsequence table with a new value and need to reset the cache, we just execute the script by calling it via the REST API:

[MAXIMO_URL]/maximo/oslc/script/runtask?_lid=maxadmin&_lpwd=maxadmin

If it works correctly, you should see something like below.

Executing the reset sequence script by calling it via REST API

No restart during a deployment means we can all go to bed earlier. Best of luck.

UPDATE: In a clustered environment, I find it doesn’t seem to refresh all the JVMs. To be sure, we might need to run the script on each JVM separately (by accessing it through each JVM’s 908x port).

Use Maximo webservice with JSON content

While the JSON API in newer versions of Maximo is quite useful, for many integration scenarios I still prefer to use the old API infrastructure with Publish Channels and Web Services. However, the native format for this feature is XML.

To send or receive JSON with a Publish Channel or Enterprise Service, we can translate the default XML to JSON before it goes out of / comes into the system. Below is a simple example of how to set it up.

Setup standard Publish Channel to send XML message

  • Create a new Publish Channel: 
    • Name: ZZSR
    • Object Structure: MXSR
  • Create a new Enterprise Service:
    • Name: ZZSR
    • Object Structure: MXSR
  • Create a new HTTP End Point: ZZWEBHOOK (to keep it simple, we use webhook.site)
  • Create a new External System:
    • Name: ZZTEST
    • End Point: ZZWEBHOOK
    • Set standard queues.
    • In Publish Channels tab, add the ZZSR channel created above.
    • Enable the channel, enable the external system, and enable event listener on the channel
  • To test and verify our Publish channel is working:
    • Create a new Service Request, then save.
    • After less than a minute, Webhook should show a new message received in XML format

To send JSON instead of XML:

  • Update publish channel ZZSR, set Processing Class with value com.ibm.tivoli.maximo.fdmbo.JSONMapperExit
  • Go to the JSON Mapping application, create a new mapping:
    • Name: ZZTEST.ZZSR.OUT (it must follow this exact format <ExtSys>.<PublishChannel>.OUT)
    • Object Structure: MXSR
    • JSON Data: { "summary": "Test SR", "details": "Sample Details", "assetnum": "11430" }
  • Save the record, then go to the Properties tab and map each JSON property to the corresponding object structure field.
  • To test that the mapping works, create another SR and check Webhook to ensure it now receives the message in JSON format.

To receive JSON via a Web Service, follow similar steps to the above; the only key difference is that we have to name the JSON mapping <ExtSys>.<EntService>.IN and use a mapping like the example below.
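For illustration, an inbound mapping for the Enterprise Service created earlier would look like this (the JSON data simply mirrors the outbound sample above):

  • Name: ZZTEST.ZZSR.IN
  • Object Structure: MXSR
  • JSON Data: { "summary": "Test SR", "details": "Sample Details", "assetnum": "11430" }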
