Thought I would share another monitor that has been in the works. This monitor connects to the Backup Exec database and checks that various fields are configured properly for your environment. Rather than looking for errors, the script provides an easier way to manage configurations across many servers and to catch scenarios that lead to backups not completing properly. The initial focus was to ensure that the Backup Exec notification system is properly configured and can be trusted. To accomplish this, there are quite a few checks concerning email configurations and addresses.
The script uses a micro-monitor framework, which I can outline in another post if there is any interest. The important thing is that a series of checks is run against the server, and the monitor alarms if any one micro-monitor fails. Currently there are 5 micro-monitors, each of which can be disabled by commenting out the corresponding line in the "Registry" section. You can also use multiple instances of a micro-monitor with different input parameters by adding another entry. The monitors are as follows:
BESmtpEnabled
Input: None
Ensures that SMTP notifications are enabled on the BE server. Simple and straightforward.
BECheckEmail
Input: Email Address
Ensures that every valid backup job (active, and scheduled in the future) has the specified email address configured for notifications. As emails can be set in multiple locations, the micro checks for addresses set on the job, policy, or selection list. Alarms if any job does not have the specified email.
BEGlobalAlertEmailCheck
Input: Email Address
Similar to BECheckEmail, but ensures the specified email is set on select alert categories ('Job Cancellation', 'Job Failed', 'Tape Alert Error', 'Media Insert', 'Job Warning'). Alarms if any of these does not have the specified address configured to receive notifications.
BEJobsOnHold
Input: Max Hold Time
This micro ensures that backup jobs do not remain on hold for longer than the hold time. As jobs may occasionally have to be held, the micro uses the hold time as a grace period to limit false alarms, but it will still detect a job that has been forgotten and is no longer backing up data.
BEJobTimeoutSet
Input: None
Ensures all valid backup jobs (active, and scheduled in the future) have a timeout set.
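As a rough illustration of the grace-period idea behind BEJobsOnHold, here is a minimal sketch in plain Lua. It is not the code from the script: newHoldTracker and the three result strings are made-up names, and an ordinary table stands in for the monitor's persistent storage. The point is only that the timestamp is recorded the first time a hold is seen, kept on later polls, and cleared once the job is released.

```lua
-- Sketch only: newHoldTracker and its return values are illustrative names,
-- and the firstSeen table stands in for the monitor's persistent storage.
function newHoldTracker(maxHoldSeconds)
  local firstSeen = {}  -- job name -> time (seconds) the hold was first detected
  return function(jobName, isOnHold, now)
    if not isOnHold then
      firstSeen[jobName] = nil          -- job released: forget the old timestamp
      return "ok"
    end
    if firstSeen[jobName] == nil then
      firstSeen[jobName] = now          -- record only the first detection
    end
    if now - firstSeen[jobName] > maxHoldSeconds then
      return "excessive"                -- held past the grace period: alarm
    end
    return "on_hold"                    -- held, but still within the grace period
  end
end

-- Example: a 30 minute grace period, polled at 0, 20 and 40 minutes.
local check = newHoldTracker(30 * 60)
print(check("Nightly", true, 0))      -- on_hold
print(check("Nightly", true, 1200))   -- on_hold
print(check("Nightly", true, 2400))   -- excessive
print(check("Nightly", false, 3000))  -- ok
```

Because the first-detection time is only known to the nearest poll, the accuracy of the hold time depends on the polling interval.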
Note: All of these micros require a connection to the BE database, and a proper connection string must be specified as the first parameter.
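For anyone curious how the pieces fit together, here is a stripped-down sketch of the registry idea in plain Lua. It is an illustration under assumptions, not the script itself: makeArgumentFetcher, runMicroMonitors and the two sample micro-monitors are hypothetical names, and the argument lookup is a plain table rather than the monitoring host's own argument function. Each micro-monitor pulls its own inputs from a shared fetcher in registration order, which is why the connection string has to come first.

```lua
-- Sketch only: all names below are illustrative, not part of the real script.

-- Returns a closure that hands out arguments in order, one per call.
function makeArgumentFetcher(getArgument)
  local i = -1
  return function()
    i = i + 1
    return getArgument(i)
  end
end

-- Runs every registered micro-monitor in order and combines the results.
-- Each micro-monitor returns a status string and a boolean state.
function runMicroMonitors(registry, monitors, nextArg)
  local lines, allOk = {}, true
  for _, name in ipairs(registry) do   -- ipairs keeps registration order
    local text, ok = monitors[name](nextArg)
    table.insert(lines, name .. "\t" .. text)
    allOk = allOk and ok
  end
  return table.concat(lines, "\n"), allOk
end

-- Example with made-up arguments and two toy micro-monitors.
local args = { [0] = "server@database", [1] = "ops@example.com" }
local nextArg = makeArgumentFetcher(function(i) return args[i] end)
local connection = nextArg()           -- first parameter: the connection string

local monitors = {
  SmtpEnabled = function() return "OK", true end,
  CheckEmail = function(fetch)
    local email = fetch()              -- consumes the next argument
    return "Jobs missing " .. email .. ": Nightly", false
  end,
}
local report, ok = runMicroMonitors({ "SmtpEnabled", "CheckEmail" }, monitors, nextArg)
print(report)
print(ok)                              -- false, because one micro-monitor failed
```

Disabling a check is then just a matter of leaving its name out of the registry list, as long as its arguments are left out as well.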
Please post any comments, suggestions, and improvements you may have.
Thanks,
Jazz Alyxzander Turner-Baggs
[Edit] : Suppose the code would be helpful
CODE
----------------------------------------------------------------------------------------------------------
Name= "BackupExecConfigurationMon"--.lua
Author= "Jazz Alyxzander Turner-Baggs"
Version= {maj=4,min=32}
Date= "2010-02-22"
--
Description = [[Monitors BackupExec Configurations]]
--
-- The script's goal is to double-check the configuration of all recurring BackupExec jobs. By defining a series of
-- micro-monitors we can ensure that all new backup jobs meet the minimum requirements to ensure other monitors will
-- catch the errors
--
-- Micro Monitors: Is SMTP enabled on the server?
-- Do BE jobs have a specified email configured for notifications?
-- Are there emails configured on specific global alert categories?
-- Have any jobs been on hold for longer than X minutes?
-- -- Allows jobs to be on hold during maintenance without causing alarms
-- Do backup jobs have a timeout value set?
--
--
-- Arguments: Various
----------------------------------------------------------------------------------------------------------
------------------------------------------------
-- Vars & Helper Func
------------------------------------------------
tRegisteredTests = {}
tRegisteredObjects = {}
-- Registers global objects in the monitor framework
function registerObject(obj)
tRegisteredObjects[obj] = true
end
-- Each Invocation returns the next input argument.
function JLuaArgumentFetchObject()
local i = -1
return function ()
i = i + 1
return GetArgument(i)
end
end
------------------------------------------------
-- Registry
------------------------------------------------
table.insert(tRegisteredTests,"BESmtpEnabled")
--table.insert(tRegisteredTests,"BECheckEmail")
table.insert(tRegisteredTests,"BEJobsOnHold")
table.insert(tRegisteredTests,"BEJobTimeoutSet")
table.insert(tRegisteredTests,"BEGlobalAlertEmailCheck")
------------------------------------------------
-- INM Configuration Section
------------------------------------------------
function OnEnumerate(sFieldToEnum)
Enum = LuaScriptEnumResult()
return Enum
end
function OnConfigure()
Config = LuaScriptConfigurator()
Config:SetAuthor(Author)
Config:SetDescription(Description)
Config:SetMinBuildVersion(5150)
Config:SetScriptVersion(Version.maj,Version.min)
Config:SetEntryPoint("Main")
Initialize(Config)
return Config
end
-- Calls all global and micro-monitor init functions
-- Primarily responsible for registering parameter inputs at monitor creation
function Initialize(Config)
GlobalInit(Config)
-- ipairs guarantees registration order, so arguments are added in the order they are later fetched
for __,TestName in ipairs(tRegisteredTests) do
if _G['INIT_'..TestName] ~= nil then
_G['INIT_'..TestName](Config)
end
end
end
-- Main entry point: runs the series of registered micro-monitors
-- The final state is true only if every micro-monitor succeeds
function Main()
local tStatus = {}
if GlobalConfig() == false then
return
end
-- Runs all micro-monitors in order of registration (ipairs guarantees the order,
-- which must match the order the arguments were registered in)
-- Stores name, return string and state
for __,TestName in ipairs(tRegisteredTests) do
local sReturn,bState = _G['MON_'..TestName]()
table.insert(tStatus,{id=TestName,string=sReturn,state=bState})
end
local return_string = ""
local return_state = true
-- Build output and final state from stored data
for __, Result in ipairs(tStatus) do
return_string = return_string .. Result.id.. '\t\t'..Result.string .. '\t' .. '\n'
return_state = return_state and Result.state
end
-- Set script status and exit
SetExitStatus(return_string,return_state)
end
-----------------------------------------------
-- Global
-----------------------------------------------
-- Invokes Initialization Functions of global objects
-- registered in the Global Object List
function GlobalInit(Config)
for GlobalObject,val in pairs(tRegisteredObjects) do
if _G['INIT_G_'..GlobalObject] ~= nil then
_G['INIT_G_'..GlobalObject](Config)
end
end
end
-- Invokes Configuration Functions of global objects
-- registered in the Global Object List
function GlobalConfig()
for obj,val in pairs(tRegisteredObjects) do
if _G['G_'..obj]() == false then
return false
end
end
-- Explicit success value; callers only test for false
return true
end
-----------------------------------------------
-- Global Objects
-----------------------------------------------
-- Object Used in Micro-Monitors to return appropriate input arguments.
-- Must be called once and only once for every argument specified in the
-- corresponding Init Function.
PM = JLuaArgumentFetchObject()
-----------------------------------------------
DB = TLuaDB()
function INIT_G_DB(Config)
Config:AddArgument("DB Connect String","<Server>@<Database> - Note double slashes are not required",LuaScriptConfigurator.CHECK_NOT_EMPTY)
end
--
function G_DB()
SqlConnectionString = PM()
if (DB:Connect(SqlConnectionString,TLuaDB.CLIENT_SQLSERVER) == false) then
local status = "Failed to connect: "..DB:GetErrorDescription()
SetExitStatus(status, false)
return false
end
end
-----------------------------------------------
-----------------------------------------------
--Micro-Monitors
-----------------------------------------------
--%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-- BECheckEmail
----------------
-- Monitors That All Active BEJobs have an Email configured for notifications.
-- Requires DB object
-- Alarm Condition: Alarm if Any ACTIVE job does not have email configured to be sent to given Email Address
----------------
registerObject("DB")
function INIT_BECheckEmail(Config)
Config:AddArgument("Email Address","Ensure all active jobs have this email configured for notifications",LuaScriptConfigurator.CHECK_NOT_EMPTY)
end
function MON_BECheckEmail()
local status = ""
local state = false
local tErrors = {}
local bResultDetected = false
local sEmail = PM()
-- Double any single quotes so the address cannot break out of the SQL string literal
local sEmailSql = string.gsub(sEmail, "'", "''")
sQueryJobs =[[Select J.JobName from jobs j JOIN (Select ObjectID,Recipient from Recipients R JOIN NS_MAIL_RECIPIENT M on R.RecipientID=M.Person_oid
where Recipient like ']]..sEmailSql..[[')R on J.BEJobID= R.ObjectID where J.TaskTypeID = 200 and CURRENT_TIMESTAMP < J.NextDueDate]]
sQueryPolicy =[[Select J.JobName from jobs j JOIN (Select ObjectID,Recipient from Recipients R JOIN NS_MAIL_RECIPIENT M on R.RecipientID=M.Person_oid
where Recipient like ']]..sEmailSql..[[')R on J.ScriptID= R.ObjectID where J.TaskTypeID = 200 and CURRENT_TIMESTAMP < J.NextDueDate]]
sQuerySelectionList =[[Select J.JobName from jobs j JOIN (Select ObjectID,Recipient from Recipients R JOIN NS_MAIL_RECIPIENT M on R.RecipientID=M.Person_oid
where Recipient like ']]..sEmailSql..[[')R on J.TaskDefinitionID= R.ObjectID where J.TaskTypeID = 200 and CURRENT_TIMESTAMP < J.NextDueDate]]
-- Queries for all active jobs and policies which do not have the specified email configured
sQueryAllJobs = [[Select J.JobName from Jobs J where J.TaskTypeID = 200 and CURRENT_TIMESTAMP < J.NextDueDate]]
sQueryAcceptableJobs = sQueryJobs .. " UNION " .. sQueryPolicy .. " UNION " .. sQuerySelectionList
sQuery = "(" .. sQueryAllJobs .. ") EXCEPT (" .. sQueryAcceptableJobs .. ")"
-- Execute and Extract all jobs which are erroring
bok = DB:Execute(sQuery)
if ( bok == false) then
status = "Failed: " .. DB:GetErrorDescription()
else
if(DB:ResultAvilable() == true) then
bResultDetected = true
while (DB:NextRow() == true) do
jobMissingEmail = DB:GetCol(1)
table.insert(tErrors,jobMissingEmail)
end
end
end
-- If no DB errors were found
if status == "" then
if bResultDetected == false then
status = "No Values returned"
state = false
elseif table.getn(tErrors) == 0 then
status = "OK - Notifications configured for " .. sEmail .. " on all recurring jobs"
state = true
else
-- Join the job names without a trailing separator
status = "Jobs missing E-mail("..sEmail..") - " .. table.concat(tErrors, ", ")
end
end
return status,state
end
--==============================================
-- END BECheckEmail
--%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
--%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-- BESmtpEnabled
----------------
-- Ensure BE has Smtp Globally Enabled
-- Requires DB object
-- Alarm Condition: ALARM if SMTP is not enabled in BE
-- Notes:
----------------
registerObject("DB")
function INIT_BESmtpEnabled(Config)
end
function MON_BESmtpEnabled()
local status = ""
local state = false
local EmailEnabled = nil
local sQuery = [[Select Enabled from NS_SMTPCONFIG]]
local bok = DB:Execute(sQuery)
if ( bok == false) then
status = "Failed: " .. DB:GetErrorDescription()
else
if(DB:ResultAvilable() == true) then
if (DB:NextRow() == true) then EmailEnabled = DB:GetCol(1) end
end
end
if status ~= "" then
-- Keep the DB error message set above instead of overwriting it
state = false
elseif EmailEnabled == nil then
status = "Error - Unrecognized value from DB "
state = false
elseif EmailEnabled == "1" then
status = "OK"
state = true
else
status = "SMTP Not Enabled on BackupExec Server"
end
return status,state
end
--==============================================
-- END BESmtpEnabled
--%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
--%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-- BEJobsOnHold
----------------
-- Monitors BackupExec DB for Jobs which have been on-hold past a given timeout
-- Requires DB object
-- Alarm Condition: Alarm if a job was first detected on hold more than X minutes ago
-- Notes: Time accuracy is dependent upon the polling interval.
----------------
registerObject("DB")
function INIT_BEJobsOnHold(Config)
Config:AddArgument("Max Hold Time","Time allowed in minutes for a backupjob to be on hold",LuaScriptConfigurator.CHECK_NOT_EMPTY)
end
function MON_BEJobsOnHold()
local status = ""
local state = false
local tData = {}
local KEY = "BkupHold"
local Storage = TLuaStorage()
local sOA = GetObjectAddress()
local iMaxTime = tonumber(PM())*60 -- argument is in minutes; convert to seconds
sQuery = [[Select jobname,CurrentStatus from jobs where CURRENT_TIMESTAMP < NextDueDate and TaskTypeID = 200]]
bok = DB:Execute(sQuery)
if ( bok == false) then
status = "Failed: " .. DB:GetErrorDescription()
else
if(DB:ResultAvilable() == true) then
bResultDetected = true
while (DB:NextRow() == true) do
tData[DB:GetCol(1)] = DB:GetCol(2)
end
end
local sJobsOnHold = ""
local sExcessiveJobs = ""
local bOnHold = false -- True if jobs were found on hold
local bExcessive = false -- True if jobs have been on hold past the timeout value.
for jobname,job_state in pairs(tData) do
currTime = TLuaDateTime():Get()
if job_state == '5' then
bOnHold = true
-- Stores the first-detection time (assumes CreateItem does not overwrite an existing item)
Storage:CreateItem(sOA .. KEY,jobname,tostring(currTime),string.len(tostring(currTime)))
LuaItem = Storage:FindItem(sOA .. KEY,jobname);
sTimeHoldWasDetected = LuaItem.m_pData
-- If Job on hold longer than specified timeout.
if currTime - tonumber(sTimeHoldWasDetected)> iMaxTime then
bExcessive = true
sExcessiveJobs = sExcessiveJobs .. jobname .. ", "
else
sJobsOnHold = sJobsOnHold .. jobname ..", "
end
else
-- Cleanup old values.
LuaItem = Storage:DeleteItem(sOA .. KEY,jobname);
end
end
-- Format Return Status
if bExcessive then
status = "Jobs on Hold past Timeout: " .. sExcessiveJobs
state = false
elseif bOnHold then
status = "Jobs on Hold: " .. sJobsOnHold
state = true
else
status = "No Jobs on Hold"
state = true
end
end
return status,state
end
--==============================================
-- END BEJobsOnHold
--%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
--%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-- BEJobTimeoutSet
----------------
-- Ensures All Jobs have a Timeout value Set.
-- Requires DB object
-- Alarm Condition: ALARM if a job does NOT have a timeout value
----------------
registerObject("DB")
function INIT_BEJobTimeoutSet(Config)
end
function MON_BEJobTimeoutSet()
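local status = ""
local state = false
local tData = {}
local sError = ""
local bErrors = false -- reset on every run so a stale value cannot trigger an alarm
local sQuery = [[Select J.jobname from jobs J, BEjobs B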
where B.EnableAbortJobThreshold = 0 and
J.TaskTypeID = 200 and
CURRENT_TIMESTAMP < J.NextDueDate and
J.BEjobID = B.BEjobID]]
bok = DB:Execute(sQuery)
if ( bok == false) then
status = "DB Error: " .. DB:GetErrorDescription()
else
if(DB:ResultAvilable() == true) then
while (DB:NextRow() == true) do
bErrors = true
sError = sError .. DB:GetCol(1) .. ", "
end
end
-- Format Return Status
if bErrors then
status = "Timeout Not Set on: " .. sError
state = false
else
status = "OK"
state = true
end
end
return status,state
end
--==============================================
-- END BEJobTimeoutSet
--%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
--%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-- BEGlobalAlertEmailCheck
----------------
-- Ensures that email notifications are set globally on a list of alert categories.
-- Requires DB object
-- Notes: Uses hardcoded List of Valid Alarms ATM. Need to Think of a method to allow for easy configuration.
-- If possible invert logic such that jobs not listed are treated as valid...
----------------
registerObject("DB")
function INIT_BEGlobalAlertEmailCheck(Config)
Config:AddArgument("Email Address","Ensure all Select Global Alerts have this address set for notifications",LuaScriptConfigurator.CHECK_NOT_EMPTY)
end
function MON_BEGlobalAlertEmailCheck()
local status = ""
local state = false
local tData = {}
local sError = ""
local bErrors = false
local sEmail = PM()
-- Double any single quotes so the address cannot break out of the SQL string literal
local sEmailSql = string.gsub(sEmail, "'", "''")
-- Alerts to ensure an email is configured for.
local tValidAlerts = {'Job Cancellation','Job Failed','Tape Alert Error','Media Insert','Job Warning'}
-- DB query to return all alerts with no email
-- (IN rather than = so the subquery may match more than one recipient row)
local sQuery = [[SELECT EventName FROM AlertMapping A,
( Select DISTINCT EventID from AlertMapping
EXCEPT
Select DISTINCT EventID from AlertRecipients where RECIPIENTID IN
(SELECT PERSON_OID from NS_MAIL_RECIPIENT where RECIPIENT like ']] .. sEmailSql ..[[')
) B where A.EventID = B.EventID]]
bok = DB:Execute(sQuery)
if ( bok == false) then
status = "DB Error: " .. DB:GetErrorDescription()
else
if(DB:ResultAvilable() == true) then
while (DB:NextRow() == true) do
local alert = DB:GetCol(1)
--Find if alert is in the List of Valid alerts
for __, validAlert in pairs(tValidAlerts) do
-- If valid Set Error Flag and append to ErrorString
if alert == validAlert then
bErrors = true
sError = sError .. alert ..", "
end
end
end
end
-- Format Return Status
if bErrors == true then
status = "Global Alerts without Email: " .. sError
state = false
else
status = "OK"
state = true
end
end
return status,state
end
--==============================================
-- END BEGlobalAlertEmailCheck
--%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%