Maximo List Archive

This is an archive of the Maximo Yahoo Community. The content of these pages may sometimes be obsolete, so please check post dates.
Thanks to the community owner Christopher Wanko for providing the content.



need help with suspected Automation Script-caused memory leak

From: therron (2018-02-19 22:29)

Maximo 7.6.0.8
I could use some help troubleshooting a memory leak that I have multiple reasons to believe is caused by Automation Scripts we wrote. Most of these scripts involve rules that we have for the loaning out of various keys. Different keys have different rules, and check different places in Maximo to find that data. For example, it may need to know if the person getting the key is 18 years or older (Person table); it may need to know if the person is qualified to operate a certain piece of machinery (Laborqual table via Labor table); etc.
We created database relationships to handle everything that would be needed for these hundreds of keys, then listed all the variables (~90) when creating the scripts and, using proper Maximo Relationship Path dotted notation, "made it work."
So, yes, the scripts do what we intended them to do. But as time goes on, the Java process continues to grow larger and larger and a couple of times caused an unexpected shutdown. I got our server admins to give me more memory and I reconfigured WebSphere to have 6 GB as suggested in the System Performance Guide.
I've been looking at SystemOut.log to see how many MBOSets and MBOs there are for the various objects. Several are in the single digits; most number in the dozens. However, it looks like most of the objects affected or queried in our scripts number in the thousands, occasionally above 10,000 for the Locations object.
Just for kicks, I decided to try going through the Active Automation Scripts and de-activate them then immediately re-activate them. After that, the counts in SystemOut.log had dropped substantially (though the memory in the Java process was still high; but that could just mean the garbage collector had not run yet, right?).
So: what do I need in my script to force it to release memory when it is done? I've seen other scripts where the Python code itself specified which MBO to use and how to query it, then calls a reset() method. With the way we did our Scripts, we kinda assumed the Scripting framework would handle all that. That was part of the point of having the variables declared as part of the Launch Point, outside of the script. If that even made sense. . .
Help?
Travis Herron


From: maxvil (2018-02-20 14:28)

As a general rule for code in loops: if you are running your script as a repeating script (e.g. an action script in a repeating escalation), you should pay close attention to releasing MboSet resources. For this, the .reset() method does not do what its name suggests; use .clear() instead to empty and reuse the set, or .close() to dispose of it.
Additionally, you should try to move between different MBO objects by navigating relationships as much as possible, without opening new connections; there are several things that can fail or lead to sub-optimal memory usage (bugs aside).
If you post the structure of your suspect scripts (without the actual private logic, of course) and their engagement conditions, I can try to replicate your issue, since I'm just playing with automation scripts (Maximo 7.6.0.6 plus ACM).
Best regards,
Massimo Villani


From: Chris Lawless (2018-02-19 17:32)

Is your script using MXServer connections for its processing?


From: therron (2018-02-21 15:20)

Can I answer Chris's question with "I don't know?"
I'll post the smallest of the scripts here. I don't think there's any super-secret stuff in here. So if anyone is willing to read it and help, thanks in advance.
So here's some background to help understand what you'd see in this script:
For these keys, the ITEMNUM is a number from 1 - 999, followed by a letter to indicate the copy number (so A means it's the first copy, B means it is the second copy of the same key, C is the third copy, etc.). Then they all end with the letter 'K'. The script I will post only covers keys with a number between 500 - 599. But there are a lot of startswith() and len() calls to make sure we're checking the rules for the right keys, to make sure we're not testing the rules for keys numbered 50 - 59, for example.
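The item number format described above could be captured in a small parser. This is only a minimal sketch, not part of the posted script: the names KEY_PATTERN, parse_key and in_range are hypothetical, and it assumes the copy letter is a single uppercase letter (the posted script uses isalpha(), which would also accept lowercase).

```python
import re

# Assumed format: 1 to 3 digits, one copy letter, then a trailing 'K'.
KEY_PATTERN = re.compile(r'^(\d{1,3})([A-Z])K$')


def parse_key(itemnum):
    """Return (key_number, copy_letter), or None if itemnum is not a key."""
    match = KEY_PATTERN.match(itemnum)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def in_range(itemnum, low, high):
    """True when itemnum is a key whose number lies between low and high."""
    parsed = parse_key(itemnum)
    return parsed is not None and low <= parsed[0] <= high


print(parse_key('517AK'))
print(in_range('50AK', 500, 599))
```

Comparing the parsed number rather than string prefixes avoids confusing key 50 with keys 500 - 599, which is the case the startswith() and len() checks guard against.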
PERSON.JOBCODE is what we're using to "classify" people. We operate all levels of education, from preschool through doctorate courses. So in that broad range, we have a few different types of people -- PCC Faculty (those that teach at the college and above), PCA Faculty (those who teach preschool/kindergarten/elementary/middle/high school), Staff (those who work full-time but don't teach), PCC Student (a college student who works to help pay for college), and GA (a graduate assistant, a student working on a Master's degree and working to help pay for it). I think that's all you'll see in there.
Generally, they make PCC Students and GA's get a "permission slip" from their supervisor to borrow one of these keys. The other JobCodes can just have them when they need them, with a few exceptions.
You'll see a lot of code repeat itself in there, with what is supposed to be an interactive dialog with the "cashier" checking out the keys, for if it is after 3:45 PM (when the first-shift "cashier" -- who has been doing this job for YEARS -- leaves work and hands over responsibility to a much-less-experienced person) to ask whether or not the key is being checked out overnight. Unless otherwise specified, keys are due back the day they are checked out. Since our office closes at 4:45 PM, checking something out after 3:45 PM is kind of unusual and often means there's been some sort of emergency and they might need the keys overnight.
Other than that, the script has a few other variables, which were all declared in the creation process in the Script Wizard. They use Relationships, which were created in the Database Configuration app, to find data starting at who is checking out this key (MATUSETRANS.ISSUETO) over to Person, PersonGroup, Labor, or LaborQual as needed. If the variable name starts with "can" or "has" then it's likely going to LaborQual. If the variable name starts with "is" then it is likely going to see if this person is a member of a PersonGroup. A specific example in here is a variable called canUPRIGHT. That takes the ISSUETO value, finds that person, finds that person's labor, and finds that labor's LaborQuals to see if it exists, is Active status, and the expiration date is in the future. So "if not canUPRIGHT is None" being true means the person is qualified to operate that machinery.
And I'll admit there's some convoluted syntax in there, like "if not <variablename> is None", which should be entered as "if <variablename> is not None", but we tried that at the very beginning and it didn't seem to work -- like Maximo didn't like the syntax. Maybe I'm delusional there, it was a long time ago.
So, here's the code:
from psdi.mbo import MboRemote
from psdi.mbo import MboConstants
from psdi.mbo import MboSetRemote
from psdi.util import MXApplicationException
from psdi.server import MXServer
from java.lang import reflect
from java.lang import String
import time

RIGHT_NOW = MXServer.getMXServer().getDate()
CURRENT_DATE_WITHOUT_TIME = time.strftime("%Y-%m-%d")  # This is a constant for the current day that this script is running. It only returns the YYYY-MM-DD in string format.
PCC_AGE_ALLOWED_TO_DRIVE = 18  # This is a constant for the PCC driver policy
CHECKOUT_TIME = '15:45:00'
CURRENT_TIME = time.strftime("%H:%M:%S")

if jobCode != 'CONTRACTOR' and storeloc == 'MTOFFICEKEYS' and issueType == 'ISSUE' and currentKey.startswith('5') and \
        currentKey[1].isdigit() and currentKey[2].isdigit() and currentKey[3].isalpha() and currentKey.endswith('K') and len(currentKey) == 5:
    if dueDate is None or RIGHT_NOW < dueDate:  # dueDate is bound to the pcc_duedate field on the matusetrans table
        if issueTo == 'SECURITY':  # We might check any key out to them, so we won't consider rules for them either
            warngroup = 'matusetrans'
            warnkey = 'SecurityCanHaveAnyKey'
            warnparams = [str(currentKey)]
            if CURRENT_TIME > CHECKOUT_TIME:
                def yes():
                    mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, True)
                def no():
                    mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, False)

                def dflt():
                    service.yncerror('matusetrans', 'keyPast3:45')
                cases = {service.YNC_NULL: dflt, service.YNC_YES: yes, service.YNC_NO: no}
                if interactive:
                    if mbo.isNull('PCC_DUEDATE'):
                        x = service.yncuserinput()
                        cases[x]()
        # start with list of unrestricted keys
        elif (currentKey.startswith('506') and currentKey[3].isalpha() and currentKey.endswith('K') and len(currentKey) == 5) \
                or (currentKey.startswith('507') and currentKey[3].isalpha() and currentKey.endswith('K') and len(currentKey) == 5) \
                or (currentKey.startswith('508') and currentKey[3].isalpha() and currentKey.endswith('K') and len(currentKey) == 5) \
                or (currentKey.startswith('509') and currentKey[3].isalpha() and currentKey.endswith('K') and len(currentKey) == 5) \
                or (currentKey.startswith('510') and currentKey[3].isalpha() and currentKey.endswith('K') and len(currentKey) == 5) \
                or (currentKey.startswith('511') and currentKey[3].isalpha() and currentKey.endswith('K') and len(currentKey) == 5) \
                or (currentKey.startswith('512') and currentKey[3].isalpha() and currentKey.endswith('K') and len(currentKey) == 5) \
                or (currentKey.startswith('513') and currentKey[3].isalpha() and currentKey.endswith('K') and len(currentKey) == 5) \
                or (currentKey.startswith('515') and currentKey[3].isalpha() and currentKey.endswith('K') and len(currentKey) == 5) \
                or (currentKey.startswith('518') and currentKey[3].isalpha() and currentKey.endswith('K') and len(currentKey) == 5) \
                or (currentKey.startswith('519') and currentKey[3].isalpha() and currentKey.endswith('K') and len(currentKey) == 5) \
                or (currentKey.startswith('520') and currentKey[3].isalpha() and currentKey.endswith('K') and len(currentKey) == 5) \
                or (currentKey.startswith('521') and currentKey[3].isalpha() and currentKey.endswith('K') and len(currentKey) == 5) \
                or (currentKey.startswith('522') and currentKey[3].isalpha() and currentKey.endswith('K') and len(currentKey) == 5) \
                or (currentKey.startswith('523') and currentKey[3].isalpha() and currentKey.endswith('K') and len(currentKey) == 5):
            if jobCode == 'STAFF' or jobCode == 'PCC FACULTY' or jobCode == 'PCA FACULTY':
                warngroup = 'matusetrans'
                warnkey = 'UnrestrictedKey'
                warnparams = [str(currentKey)]
                if CURRENT_TIME > CHECKOUT_TIME:
                    def yes():
                        mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, True)
                    def no():
                        mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, False)

                    def dflt():
                        service.yncerror('matusetrans', 'keyPast3:45')
                    cases = {service.YNC_NULL: dflt, service.YNC_YES: yes, service.YNC_NO: no}
                    if interactive:
                        if mbo.isNull('PCC_DUEDATE'):
                            x = service.yncuserinput()
                            cases[x]()
            else:
                warngroup = 'matusetrans'
                warnkey = 'PermissionSlipRequired'
                warnparams = [str(currentKey)]
                if CURRENT_TIME > CHECKOUT_TIME:
                    def yes():
                        mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, True)
                    def no():
                        mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, False)

                    def dflt():
                        service.yncerror('matusetrans', 'keyPast3:45')
                    cases = {service.YNC_NULL: dflt, service.YNC_YES: yes, service.YNC_NO: no}
                    if interactive:
                        if mbo.isNull('PCC_DUEDATE'):
                            x = service.yncuserinput()
                            cases[x]()
        elif currentKey[0].isdigit() and currentKey.endswith('K'):  # The first character is a number
            if len(currentKey) == 5 and currentKey[1].isdigit() and currentKey[2].isdigit():  # This key length should be 5 characters long
                if currentKey.startswith('5'):
                    if currentKey.startswith('50'):
                        if currentKey.startswith('500') or currentKey.startswith('505'):
                            errorgroup = 'matusetrans'
                            errorkey = 'unknownKey'
                            params = [str(currentKey)]
                        else:  # currentKey.startswith('501') or currentKey.startswith('502') or currentKey.startswith('503')
                            # or currentKey.startswith('504')
                            if jobCode == 'STAFF' or jobCode == 'PCC FACULTY' or jobCode == 'PCA FACULTY':
                                warngroup = 'matusetrans'
                                warnkey = 'CheckoutAllowed'
                                warnparams = [str(currentKey)]
                                if CURRENT_TIME > CHECKOUT_TIME:
                                    def yes():
                                        mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, True)
                                    def no():
                                        mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, False)

                                    def dflt():
                                        service.yncerror('matusetrans', 'keyPast3:45')
                                    cases = {service.YNC_NULL: dflt, service.YNC_YES: yes, service.YNC_NO: no}
                                    if interactive:
                                        if mbo.isNull('PCC_DUEDATE'):
                                            x = service.yncuserinput()
                                            cases[x]()
                            elif jobCode == 'GA':
                                warngroup = 'matusetrans'
                                warnkey = 'PermissionSlipRequired'
                                warnparams = [str(currentKey)]
                                if CURRENT_TIME > CHECKOUT_TIME:
                                    def yes():
                                        mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, True)
                                    def no():
                                        mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, False)

                                    def dflt():
                                        service.yncerror('matusetrans', 'keyPast3:45')
                                    cases = {service.YNC_NULL: dflt, service.YNC_YES: yes, service.YNC_NO: no}
                                    if interactive:
                                        if mbo.isNull('PCC_DUEDATE'):
                                            x = service.yncuserinput()
                                            cases[x]()
                            else:
                                errorgroup = 'matusetrans'
                                errorkey = 'NotStaffFacultyOrGA'
                                params = [str(currentKey)]
                        # These are unrestricted keys, handled near the top:
                        # currentKey.startswith('506') or currentKey.startswith('507') or currentKey.startswith('508')
                        # or currentKey.startswith('509')
                    elif currentKey.startswith('51'):
                        # These are unrestricted keys, handled near the top:
                        # currentKey.startswith('510') or currentKey.startswith('511') or currentKey.startswith('512') or
                        # currentKey.startswith('513') or currentKey.startswith('515') or currentKey.startswith('518') or
                        # currentKey.startswith('519')
                        if currentKey.startswith('514') or currentKey.startswith('516'):
                            errorgroup = 'matusetrans'
                            errorkey = 'unknownKey'
                            params = [str(currentKey)]
                        else:  # currentKey.startswith('517') # see also keys 166, 179, and 912 -- these should all have the same rules
                            if not canUPRIGHT is None or not hasActiveAWPT is None:
                                if not is18OrOlder is None:
                                    if jobCode == 'STAFF' or jobCode == 'PCC FACULTY' or jobCode == 'PCA FACULTY':
                                        warngroup = 'matusetrans'
                                        warnkey = 'CheckoutAllowed'
                                        warnparams = [str(currentKey)]
                                        if CURRENT_TIME > CHECKOUT_TIME:
                                            def yes():
                                                mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, True)
                                            def no():
                                                mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, False)

                                            def dflt():
                                                service.yncerror('matusetrans', 'keyPast3:45')
                                            cases = {service.YNC_NULL: dflt, service.YNC_YES: yes, service.YNC_NO: no}
                                            if interactive:
                                                if mbo.isNull('PCC_DUEDATE'):
                                                    x = service.yncuserinput()
                                                    cases[x]()
                                    else:
                                        warngroup = 'matusetrans'
                                        warnkey = 'PermissionSlipRequired'
                                        warnparams = [str(currentKey)]
                                        if CURRENT_TIME > CHECKOUT_TIME:
                                            def yes():
                                                mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, True)
                                            def no():
                                                mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, False)

                                            def dflt():
                                                service.yncerror('matusetrans', 'keyPast3:45')
                                            cases = {service.YNC_NULL: dflt, service.YNC_YES: yes, service.YNC_NO: no}
                                            if interactive:
                                                if mbo.isNull('PCC_DUEDATE'):
                                                    x = service.yncuserinput()
                                                    cases[x]()
                                else:  # error message for being younger than 18
                                    errorgroup = 'matusetrans'
                                    errorkey = 'AgeUnknownOrYoungerThan18'
                                    params = [str(currentKey)]
                            else:
                                warngroup = 'matusetrans'
                                warnkey = 'NotQualifiedForUpright'
                                warnparams = [str(currentKey)]
                    elif currentKey.startswith('52'):
                        # These are unrestricted keys, handled near the top:
                        # currentKey.startswith('520') or currentKey.startswith('521') or currentKey.startswith('522') or
                        # currentKey.startswith('523')
                        if currentKey.startswith('525') or currentKey.startswith('527'):
                            if jobCode == 'STAFF' or jobCode == 'PCC FACULTY' or jobCode == 'PCA FACULTY':
                                warngroup = 'matusetrans'
                                warnkey = 'CheckoutAllowed'
                                warnparams = [str(currentKey)]
                                if CURRENT_TIME > CHECKOUT_TIME:
                                    def yes():
                                        mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, True)
                                    def no():
                                        mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, False)

                                    def dflt():
                                        service.yncerror('matusetrans', 'keyPast3:45')
                                    cases = {service.YNC_NULL: dflt, service.YNC_YES: yes, service.YNC_NO: no}
                                    if interactive:
                                        if mbo.isNull('PCC_DUEDATE'):
                                            x = service.yncuserinput()
                                            cases[x]()
                            elif jobCode == 'GA':
                                warngroup = 'matusetrans'
                                warnkey = 'PermissionSlipRequired'
                                warnparams = [str(currentKey)]
                                if CURRENT_TIME > CHECKOUT_TIME:
                                    def yes():
                                        mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, True)
                                    def no():
                                        mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, False)

                                    def dflt():
                                        service.yncerror('matusetrans', 'keyPast3:45')
                                    cases = {service.YNC_NULL: dflt, service.YNC_YES: yes, service.YNC_NO: no}
                                    if interactive:
                                        if mbo.isNull('PCC_DUEDATE'):
                                            x = service.yncuserinput()
                                            cases[x]()
                            else:
                                errorgroup = 'matusetrans'
                                errorkey = 'NotStaffFacultyOrGA'
                                params = [str(currentKey)]
                        elif currentKey.startswith('526'):
                            if (not isMaintZoneTechPCA is None or not isMaintHVACKitchen is None) and \
                                    (jobCode == 'STAFF' or jobCode == 'PCC FACULTY' or jobCode == 'PCA FACULTY' or jobCode == 'GA'):
                                warngroup = 'matusetrans'
                                warnkey = 'CheckoutAllowed'
                                warnparams = [str(currentKey)]
                                if CURRENT_TIME > CHECKOUT_TIME:
                                    def yes():
                                        mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, True)
                                    def no():
                                        mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, False)

                                    def dflt():
                                        service.yncerror('matusetrans', 'keyPast3:45')
                                    cases = {service.YNC_NULL: dflt, service.YNC_YES: yes, service.YNC_NO: no}
                                    if interactive:
                                        if mbo.isNull('PCC_DUEDATE'):
                                            x = service.yncuserinput()
                                            cases[x]()
                            elif jobCode == 'STAFF' or jobCode == 'PCC FACULTY' or jobCode == 'PCA FACULTY' or jobCode == 'GA':
                                warngroup = 'matusetrans'
                                warnkey = 'PermissionNeededFromMTPCAZTOrKitchenRepr'
                                warnparams = [str(currentKey)]
                                if CURRENT_TIME > CHECKOUT_TIME:
                                    def yes():
                                        mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, True)
                                    def no():
                                        mbo.setFieldFlag('PCC_DUEDATE', MboConstants.REQUIRED, False)

                                    def dflt():
                                        service.yncerror('matusetrans', 'keyPast3:45')
                                    cases = {service.YNC_NULL: dflt, service.YNC_YES: yes, service.YNC_NO: no}
                                    if interactive:
                                        if mbo.isNull('PCC_DUEDATE'):
                                            x = service.yncuserinput()
                                            cases[x]()
                            else:
                                errorgroup = 'matusetrans'
                                errorkey = 'NotStaffFacultyOrGA'
                                params = [str(currentKey)]
                        else:  # currentKey.startswith('524') or currentKey.startswith('528') or currentKey.startswith('529')
                            errorgroup = 'matusetrans'
                            errorkey = 'unknownKey'
                            params = [str(currentKey)]
                    elif currentKey.startswith('53'):  # currently, numbers 530, 531, 532, 533, 534, 535, 536, 537, 538, and 539 are not used
                        errorgroup = 'matusetrans'
                        errorkey = 'unknownKey'
                        params = [str(currentKey)]
                    elif currentKey.startswith('54'):  # currently, numbers 540, 541, 542, 543, 544, 545, 546, 547, 548, and 549 are not used
                        errorgroup = 'matusetrans'
                        errorkey = 'unknownKey'
                        params = [str(currentKey)]
                    elif currentKey.startswith('55'):  # currently, numbers 550, 551, 552, 553, 554, 555, 556, 557, 558, and 559 are not used
                        errorgroup = 'matusetrans'
                        errorkey = 'unknownKey'
                        params = [str(currentKey)]
                    elif currentKey.startswith('56'):  # currently, numbers 560, 561, 562, 563, 564, 565, 566, 567, 568, and 569 are not used
                        errorgroup = 'matusetrans'
                        errorkey = 'unknownKey'
                        params = [str(currentKey)]
                    elif currentKey.startswith('57'):  # currently, numbers 570, 571, 572, 573, 574, 575, 576, 577, 578, and 579 are not used
                        errorgroup = 'matusetrans'
                        errorkey = 'unknownKey'
                        params = [str(currentKey)]
                    elif currentKey.startswith('58'):  # currently, numbers 580, 581, 582, 583, 584, 585, 586, 587, 588, and 589 are not used
                        errorgroup = 'matusetrans'
                        errorkey = 'unknownKey'
                        params = [str(currentKey)]
                    else:  # currentKey.startswith('59') # currently, numbers 590, 591, 592, 593, 594, 595, 596, 597, 598, and 599 are not used
                        errorgroup = 'matusetrans'
                        errorkey = 'unknownKey'
                        params = [str(currentKey)]
                else:
                    errorgroup = 'matusetrans'
                    errorkey = 'unknownKey'
                    params = [str(currentKey)]
            else:
                errorgroup = 'matusetrans'
                errorkey = 'unknownKey'
                params = [str(currentKey)]
    else:
        errorgroup = 'matusetrans'
        errorkey = 'illegalFutureDate'
        params = [str(RIGHT_NOW)]
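
The repeated after-3:45 dialog and the long list of unrestricted prefixes in the script above could each be written once. The following is only a sketch under assumptions, not the poster's code: is_unrestricted, ask_overnight, FakeService and FakeMbo are hypothetical names, the fake classes stand in for the "service" and "mbo" variables so the sketch runs outside Maximo, and the numeric constants are placeholders rather than Maximo's real values. One behavioral difference: an answer other than yes or no falls through to the yncerror() call instead of raising a KeyError from the dictionary lookup.

```python
# Prefixes the posted script treats as unrestricted keys.
UNRESTRICTED_PREFIXES = frozenset([
    '506', '507', '508', '509', '510', '511', '512', '513',
    '515', '518', '519', '520', '521', '522', '523',
])

REQUIRED = 8  # placeholder; inside Maximo pass MboConstants.REQUIRED instead


def is_unrestricted(current_key):
    """Same test as the long elif chain: prefix, copy letter, trailing K."""
    return (len(current_key) == 5
            and current_key[:3] in UNRESTRICTED_PREFIXES
            and current_key[3].isalpha()
            and current_key.endswith('K'))


def ask_overnight(mbo, service, interactive, current_time,
                  checkout_time='15:45:00', required_flag=REQUIRED):
    """Run the after-hours yes/no/cancel dialog that the script repeats."""
    if current_time <= checkout_time:
        return
    if not interactive or not mbo.isNull('PCC_DUEDATE'):
        return
    answer = service.yncuserinput()
    if answer == service.YNC_YES:
        mbo.setFieldFlag('PCC_DUEDATE', required_flag, True)
    elif answer == service.YNC_NO:
        mbo.setFieldFlag('PCC_DUEDATE', required_flag, False)
    else:
        # Covers YNC_NULL (no answer given yet) and anything unexpected.
        service.yncerror('matusetrans', 'keyPast3:45')


class FakeService(object):
    """Stand-in for the script's 'service' variable; constants are placeholders."""
    YNC_NULL, YNC_YES, YNC_NO = -1, 8, 16

    def __init__(self, answer):
        self.answer = answer
        self.errors = []

    def yncuserinput(self):
        return self.answer

    def yncerror(self, group, key):
        self.errors.append((group, key))


class FakeMbo(object):
    """Stand-in for the script's 'mbo' variable with an empty due date."""

    def __init__(self):
        self.flags = []

    def isNull(self, attribute):
        return True

    def setFieldFlag(self, attribute, flag, state):
        self.flags.append((attribute, flag, state))


demo_mbo = FakeMbo()
ask_overnight(demo_mbo, FakeService(FakeService.YNC_YES), True, '16:00:00')
print(demo_mbo.flags)
```

With helpers like these, each branch of the script would only set its warning or error key and then make a single ask_overnight(...) call.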


From: alexwalter05 (2018-02-20 12:43)

Travis,
Take a look at how you're accessing your MboSet objects. If you are fetching related records off of the implicit script "mbo" variable, then you should be fine. If you are creating new MboSet objects, then be sure to call the close() method on the MboSet when you are finished with it. This will alert the JVM that the object is ready to be garbage collected.

I like to wrap new MboSet calls in a try/catch to ensure that the close() method always gets called in the "finally" block.

Hope that helps,
Alex
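
The close-in-finally pattern described above might look like the sketch below. It is an illustration only: FakeMboSet and count_and_close are hypothetical names, and the fake class stands in for a real MboSet so the code runs outside Maximo. In an automation script, the open_set callable would instead wrap whatever call creates the new set (for example a set obtained from MXServer, which is an assumption about how the scripts in this thread open theirs).

```python
class FakeMboSet(object):
    """Stand-in for a Maximo MboSet so the sketch runs outside Maximo."""

    def __init__(self, rows):
        self.rows = rows
        self.closed = False

    def count(self):
        return len(self.rows)

    def close(self):
        self.closed = True


def count_and_close(open_set):
    """Open a set via open_set(), use it, and always close it afterwards."""
    mbo_set = open_set()
    try:
        return mbo_set.count()
    finally:
        # Runs on a normal return and when count() raises.
        mbo_set.close()


demo_set = FakeMboSet(['LOC1', 'LOC2', 'LOC3'])
print(count_and_close(lambda: demo_set))
print(demo_set.closed)
```

Sets reached through a relationship on the implicit mbo variable are not opened by the script itself, so per the advice above they would not need this treatment.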


From: mark_robbins1 (2018-02-24 14:14)

Hi Travis,
This is an interesting problem.
I have some questions for you.
- What is the WebSphere version (including the sub version)?
Some old versions have serious memory leaks. Your relatively low MBO counts point more towards a WAS leak.
- You say that 7.6.0.8 is your version. Are any interim fix packs installed?
- What is triggering the automation script e.g. Object level or Attribute level launchpoints?
I assume that the interactive code means that this isn't crontask/interface etc.
"(though the memory in the Java process was still high; but that could just mean the garbage collector had not run yet, right?). "
- How long are the users working in the application before they either log out or switch applications (switching via the Application list, not via a Go To from within an application)?
This is important because Maximo frees up the memory associated with an application when the user leaves the application or when they logout.
"the Java process continues to grow larger and larger "
are you basing that on the memory usage readings in the SystemOut.log file?
"and a couple of times caused an unexpected shutdown. "
Were heapdump files generated?
Have they been analysed?
Have heapdumps been generated and analysed when the memory usage is a problem?
These should tell you what objects are using the memory i.e. where objects are actually being leaked.
"I got our server admins to give me more memory and I reconfigured WebSphere to have 6 GB as suggested in the System Performance Guide. "
It is good to have the level set to the guide's level but it isn't the long term answer. In the short term this is probably delaying the restarts (which is a good thing).
"I've been looking at SystemOut.log to see how many MBOSets and MBOs there are for the various objects. ... . However, it looks like most of the objects affected or queried in our scripts number in the thousands, occasionally above 10,000 for the Locations object. "
IBM have previously stated that 20,000+ should be considered a memory leak. Is there a pattern between the MBO counts and the times that you are worried about the leak?
Other people gave good advice about resetting/closing MboSets if they were explicitly opened.
- Have you got the DB connection watchdog set to INFO?
That should tell you if code is also leaking connections and potentially mbosets. If there are leaked connections then there will be a log entry, with the stack trace of the code that requested the connection, when the connection is leaked and when the JVM is shutdown.
I would be interested in doing a log/heapdump analysis, with NDAs if necessary, to help identify the cause. I am a highly technical, experienced Maximo Support engineer.
If you are interested then please respond and we can figure out a way to privately share contact details.
best regards,
Mark


From: therron (2018-02-26 14:16)

>>WebSphere version
Integrated Solutions Console, 8.5.5.3
Build Number: cf031430.01
Build Date: 7/30/14
This is what shipped with Maximo 7.6.0.0
>>7.6.0.8 is your version. Are any interim fix packs installed?

No. This was a brand-new install at 7.6.0.0, then patched to 7.6.0.8. Nothing else.
>>What is triggering the automation script e.g. Object level or Attribute level launchpoints?
Currently there is one script triggered by Cron Task, a couple more coming soon. That one grabs data out of the Companies object and some of its child objects -- but these objects don't appear to be problematic, based on the MBOSet and MBO counts in the SystemOut.log. The other 30+ scripts in use Object or Attribute launchpoints.
>>How long are the users working in the application before they either logout or switch applications (switching via the Application list - not via the a goto application within an application ?
Now that's interesting! For the 16 scripts that I'm most concerned about being the problem, it's like this: There's one computer where a user would log in and go to that app, Issues and Transfers. They might be logged into that ALL DAY. From that terminal, they'd have no reason to use any other app (though they could if they wanted to). The only other variable in this then would be the 30-minute auto-logout. If there were a customer at least every 25-minutes-or-so, this person might be logged in there for a full work day. Then a logout would occur when someone else takes over that responsibility for the last hour of the day.
However, just looking at Windows Task Manager, even waiting overnight does not show a reduction in the amount of memory used by the Java process running MXServer. The size *should* decrease, right?
>>the Java process continues to grow larger and larger
That's just glancing at Windows Task Manager.
>>and a couple of times caused an unexpected shutdown
Heapdumps? Yes. Analyzed? No. We're working on that. It's new territory for me.
>>IBM have previously stated that 20,000+ should be considered a memory leak
Yeah, I read that. But wouldn't that also have to do with the # of users, and what those users are doing? I don't know that I've ever had more than 25 people logged in at once, and most of those are doing work orders. A small subset would be doing inventory-related things. I'm just about the only person who would do anything with Locations and People. One or two others might occasionally do something with Labor.
>>Other people gave good advice about reseting/closing mbosets if they were explicitly opened.
So, Friday it was discovered that the few times we specifically created or fetched an MBOSet, they WEREN'T reset. I have a part-time worker who's good at Python, to whom I delegated most of the script-writing, and I thought I had made clear the importance of calling reset(). Apparently not clear enough. So that was added back in on Friday; we will see if it makes any difference this week. Those scripts that had that issue are not the ones I suspect are the real problems. Those scripts might only get called 10 times per day; the scripts with the keys being checked out get called hundreds of times per day.
>>Have you got the DB connection watchdog set to INFO?
No, but I will now!


From: mark_robbins1 (2018-02-26 16:42)

Hi Travis,
">>WebSphere version

Integrated Solutions Console, 8.5.5.3
This is what shipped with Maximo 7.6.0.0 "
Best to upgrade a non-production installation to one of the later WebSphere versions and try to replicate it.
Be careful to test the other functionality because some Maximo functions break with the later versions. I think Maximo Anywhere has problems with one of the later fix packs.
"The other 30+ scripts in use Object or Attribute launchpoints. "
OK, so I'm definitely interested, because we deploy scripts with these but I haven't heard of any leaks.
">>How long are the users working in the application before they either logout or switch applications (switching via the Application list - not via the a goto application within an application ?

Now that's interesting! For the 16 scripts that I'm most concerned about being the problem, it's like this: There's one computer where a user would log in and go to that app, Issues and Transfers. They might be logged into that ALL DAY. From that terminal, they'd have no reason to use any other app (though they could if they wanted to). "
"That's just glancing at Windows Task Manager. "
Task Manager is the wrong thing to look at. Look at the memory entries in the log files e.g. BMXAA7019I - The total memory is 4294967296 and the memory available is 3304145928
">>and a couple of times caused an unexpected shutdown
Heapdumps? Yes. Analyzed? No. We're working on that. It's new territory for me. "
I regularly do heapdump analysis and if you don't know what is normal then you are potentially going to go down a number of blind alleys.
">>IBM have previously stated that 20,000+ should be considered a memory leak
Yeah, I read that. But wouldn't that also have to do with the # of users, and what those users are doing? "
It does relate to what the users are doing but someone working on a workorder will also be loading a location MBO. The mbo counts are a total for all the users that are logged on at that moment in time.

">>Have you got the DB connection watchdog set to INFO?
No, but I will now! "
If you haven't already found it then you may find my article here useful. I don't talk a lot about interpreting the stack traces but happy to discuss privately.
https://www.linkedin.com/pulse/monitoring-number-db-connections-maximo-uses-mark-robbins/
The offer still stands if you want me to have a look/analyse free of charge. I work with a lot of installations so it is in my interest to identify/resolve an automation script memory leak if there is one.
Happy to do remotely (with my having no control over your PC) and on a conference call.
In terms of who I am - see my blog here.
https://www.linkedin.com/pulse/maximo-support-advice-from-non-ibm-engineer-article-mark-robbins/
Happy to connect with you on LinkedIn if you want to take up my offer.
Mark Robbins


From: mark_robbins1 (2018-02-27 06:36)

Hi Travis,
one thing to do is to graph the memory figures from these log entries for an extended period of time e.g. during the daytime:
BMXAA7019I - The total memory is 4294967296 and the memory available is 3304145928
It needs to be a line graph with one line showing the total memory and another showing the available memory.
then post just the graph here
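
A minimal sketch of pulling those figures out of the log so they can be graphed. The message text is taken from the BMXAA7019I entry quoted above; the names MEMORY_LINE, parse_memory and to_csv are hypothetical, and the sketch deliberately ignores timestamps, since the exact timestamp layout of the log is not shown in this thread.

```python
import re

MB = 1024 * 1024

# Matches the BMXAA7019I message quoted in the post above.
MEMORY_LINE = re.compile(
    r'BMXAA7019I - The total memory is (\d+) and the memory available is (\d+)')


def parse_memory(lines):
    """Yield (total_bytes, available_bytes, used_bytes) per matching line."""
    for line in lines:
        match = MEMORY_LINE.search(line)
        if match:
            total = int(match.group(1))
            available = int(match.group(2))
            yield total, available, total - available


def to_csv(rows):
    """Render rows as CSV text in whole megabytes, ready for a spreadsheet."""
    out = ['total_mb,available_mb,used_mb']
    for total, available, used in rows:
        out.append('%d,%d,%d' % (total // MB, available // MB, used // MB))
    return '\n'.join(out)


sample = [
    'BMXAA7019I - The total memory is 4294967296 '
    'and the memory available is 3304145928',
    'some unrelated log line',
]
print(to_csv(parse_memory(sample)))
```

Feeding it the lines of a real log file (for example, for line in open(path)) and charting the total and available columns gives the two-line graph requested above.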


From: therron (2018-03-19 20:48)

So, Mark, . . .if that offer still stands, how can I get you 16 GB worth of system dumps? We've been trying lots of stuff to figure it out, still haven't gotten it.
Travis Herron


From: mark_robbins1 (2018-03-20 12:34)

Hi Travis.
I can't see a way to send you a private email on this system.
The best way is to connect with me on LinkedIn and send me an email so I have your email address.
If you don't use LinkedIn then go to the Vetasi website and use the "contact us" link: http://www.vetasi.com/contact-us
Say in the comments that you have been told to ask for me. I will tell the sales team to expect the submission and pass it straight through to me.
This may take more than one analysis depending on how your system is configured.
Once we have a private email then I will send you:
- what configuration settings to check to ensure you have the right information
- what additional files I will need (e.g. JVM logs)
- what details I will need (e.g. server settings)
- details about how to transfer the files to me
The good news is that I probably won't need all of the 16GB, but I can advise on that when we chat.
best regards,
Mark