7x24 Exchange 2014 Spring Magazine - page 44

Alarming and Alerting
Monitoring goes hand in hand with
alerting. A DCIM Monitoring installation
should consolidate all the data
necessary in order to perform alarming
on all relevant aspects of the operation
of a data center. Alarms can be
triggered by the system when any
values go outside of pre-specified
ranges. A more sophisticated alarm
would trigger when any values change
rapidly, even though they are still within
range. A rapid temperature change
could trigger an alarm proactively,
allowing for some extra time to fix the
problem before the actual values go
outside of parameters. An alarming
engine should not only be able to
generate alarms, but also to aggregate
alarms issued by specific devices or
some of the “islands of monitoring,”
typically via SNMP. It should be able to
distinguish different severities of
alarms, and in the case of an “alarm
storm” where things go wrong and
every device is screaming, to sensibly
and logically deal with multiple alarms
without triggering panic.
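
The two alarm types described above, out-of-range and rapid-change, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `AlarmRule` class, the `check_reading` function, and the temperature limits are hypothetical and do not come from any particular DCIM product.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class AlarmRule:
    """Hypothetical rule for one monitored value (names are illustrative)."""
    low: float        # lowest acceptable value
    high: float       # highest acceptable value
    max_rate: float   # largest acceptable change per second


def check_reading(rule: AlarmRule,
                  previous: Optional[Tuple[float, float]],
                  current: Tuple[float, float]) -> List[str]:
    """Return the alarms raised by one (timestamp_seconds, value) reading."""
    alarms = []
    t_now, v_now = current
    # Classic alarm: the value has left its pre-specified range.
    if not (rule.low <= v_now <= rule.high):
        alarms.append("OUT_OF_RANGE")
    # Proactive alarm: the value is moving too fast, even if still in range.
    if previous is not None:
        t_prev, v_prev = previous
        elapsed = t_now - t_prev
        if elapsed > 0 and abs(v_now - v_prev) / elapsed > rule.max_rate:
            alarms.append("RAPID_CHANGE")
    return alarms


# Assumed example: inlet temperature in degrees C, allowed range 18 to 27,
# changing by at most 0.05 degrees per second.
rule = AlarmRule(low=18.0, high=27.0, max_rate=0.05)
steady = check_reading(rule, (0.0, 22.0), (60.0, 22.5))
rising = check_reading(rule, (60.0, 22.5), (120.0, 26.5))
too_hot = check_reading(rule, (120.0, 26.5), (180.0, 30.5))
print(steady, rising, too_hot)
```

In this sketch the second reading is still within range but raises the rapid-change alarm, which is the extra warning time the text refers to.
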
The alarm can be manifested in
different ways: The system could send a
text message, or an email, or turn on a
physical alarm. It could create and
populate a new trouble ticket in an
issue tracking system (ITS). It may
request that an alert be acknowledged
and, if not acknowledged within a
pre-specified time, escalate the alarm to
another level. It can pass on specific
alarms to the control applications,
security applications or emergency
response applications that you have set
up to deal with specific situations. The
alarms may also be displayed on
dashboards, overlaid on physical views
of the data center.
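
The acknowledge-or-escalate behavior can be illustrated with a small Python sketch. The `Alert` class, the `escalation_level` function, the five-minute timeout, and the list of roles are all assumptions made for the example, not features of a specific system.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Alert:
    """Hypothetical alert record; times are seconds since an arbitrary start."""
    message: str
    raised_at: float
    acknowledged_at: Optional[float] = None


def escalation_level(alert: Alert, now: float, ack_timeout: float,
                     levels: List[str]) -> Optional[str]:
    """Return who should currently be notified, or None once acknowledged."""
    if alert.acknowledged_at is not None and alert.acknowledged_at <= now:
        return None
    # Each full timeout that passes without acknowledgement moves up one level,
    # stopping at the last level in the list.
    steps = int((now - alert.raised_at) // ack_timeout)
    return levels[min(steps, len(levels) - 1)]


# Assumed escalation chain and a 300-second acknowledgement window.
levels = ["on-call technician", "shift supervisor", "facility manager"]
alert = Alert("CRAC-3 supply air temperature high", raised_at=0.0)

first = escalation_level(alert, now=120.0, ack_timeout=300.0, levels=levels)
second = escalation_level(alert, now=400.0, ack_timeout=300.0, levels=levels)
last = escalation_level(alert, now=5000.0, ack_timeout=300.0, levels=levels)

alert.acknowledged_at = 450.0
after_ack = escalation_level(alert, now=500.0, ack_timeout=300.0, levels=levels)
print(first, second, last, after_ack)
```
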
Alarms should be not only reactive but
also proactive, identifying potential
problems. One example of proactive
“what if” alarming is the concept of
“teaming.” For example, a branch circuit
might be operating well within its
stated limits, but if another
corresponding or “teamed” circuit were
to go down then the power from the
second circuit would be transferred to
the first circuit, resulting in an overload
situation. For a simple example like this,
you could deal with this potential
overload by setting the alarm point of
redundant circuits at, say, 40 percent of
peak. However, when dealing with
complex, multi-tier environments, or
more than one potential simultaneous
problem, it is not so simple. In such a
case, a tool should be able to simulate
the impact of such failures and issue
alarms before the problem actually
happens, which can be invaluable for
ensuring a high-availability
environment. It is better to prevent a
problem from happening in the first
place than to wait for the problem to be
manifested, and then rush to fix it.
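
A minimal sketch of such a “what if” check for teamed circuits follows. It assumes the simplest case, in which the entire load of a failed circuit moves to its single teamed partner; the circuit names, loads, and limits are invented for illustration, and a real multi-tier simulation would be considerably more involved.

```python
from typing import Dict, List


def simulate_failover(loads: Dict[str, float],
                      limits: Dict[str, float],
                      teams: Dict[str, str]) -> List[str]:
    """Assume each circuit fails in turn and its load moves to its partner.

    Returns one warning per failure that would overload the partner circuit.
    """
    warnings = []
    for failed, partner in teams.items():
        projected = loads[partner] + loads[failed]
        if projected > limits[partner]:
            warnings.append(
                f"{partner} would carry {projected:.1f} A of "
                f"{limits[partner]:.1f} A if {failed} failed"
            )
    return warnings


# Assumed example data: every circuit is individually within its 16 A limit.
loads = {"A1": 9.0, "B1": 8.0, "A2": 6.0, "B2": 5.0}
limits = {"A1": 16.0, "B1": 16.0, "A2": 16.0, "B2": 16.0}
teams = {"A1": "B1", "B1": "A1", "A2": "B2", "B2": "A2"}

warnings = simulate_failover(loads, limits, teams)
for line in warnings:
    print(line)
```

Here the A1/B1 pair is flagged even though neither circuit is currently over its limit, while the A2/B2 pair is not, because its combined load stays below the limit.
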
Monitoring can be performed either
passively or actively: a passive
approach receives information from
devices and solutions which “push”
data and issue alerts, while an active
approach interrogates the devices and
solutions and “pulls” the data, an
approach which is commonly referred
to as “polling.” As for the specific bits of
information (the “data points” or simply
“points”), these vary depending on the
type of device being interrogated. For
example, a UPS may have 20 points
being monitored, a chiller may have 12,
a PDU may have 7, and a temperature
sensor would be a single point. A good
DCIM Monitoring platform lets you
define which points you want to
monitor, how to treat those points, and
how often they should get stored.
Another key aspect is the ability to
“tune” the polling frequency: there are
some critical elements which should be
polled very frequently, say every five
seconds, while others can be polled
every five minutes. The more polling, of
course, the more load on the network,
on the database, and on the equipment
itself. Proper configuration is required
to maximize the value without
overloading.
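
Per-point polling intervals can be sketched as follows. The point names and the `Point`, `due_points`, and `polls_per_hour` helpers are hypothetical; the five-second and five-minute intervals echo the figures in the text, and the one-minute interval is an added assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Point:
    """Hypothetical monitored data point with its own polling interval."""
    name: str
    interval_s: int   # polling interval in seconds


def due_points(points: List[Point], tick_s: int) -> List[str]:
    """Names of the points that should be polled at a given second."""
    return [p.name for p in points if tick_s % p.interval_s == 0]


def polls_per_hour(points: List[Point]) -> int:
    """Total number of polls issued per hour, a rough measure of load."""
    return sum(3600 // p.interval_s for p in points)


points = [
    Point("ups1.output_load", 5),        # critical: every five seconds
    Point("chiller1.supply_temp", 60),   # assumed: every minute
    Point("pdu7.energy_kwh", 300),       # routine: every five minutes
]

at_5s = due_points(points, 5)
at_60s = due_points(points, 60)
at_300s = due_points(points, 300)
hourly = polls_per_hour(points)
print(at_5s, at_60s, at_300s, hourly)
```

Even in this three-point example, the single five-second point accounts for 720 of the 792 polls per hour, which is why the polling frequency needs tuning.
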
Scalability
The above discussion brings up the
question of scalability and robustness
of your DCIM Monitoring foundation
layer. Today’s largest data centers may
have hundreds of thousands of devices,
which can translate into millions of
“points” being monitored. A DCIM
Monitoring solution, including the
software itself, the machine(s) on which
it runs, the networking environment
and the database engine, needs to be
designed in order to handle potentially
millions of points, day in, day out. This
includes robust enough polling engines,
as well as the harmonization/conversion
of units and protocols, alarming, and
storing of the data. Just as a sedan and
an 18-wheeler are different in scale,
even though they have the same
general functionality, there is a
difference between DCIM Monitoring
platforms in terms of scalability. If you
have multiple large data centers,
be aware that not all DCIM solutions are
designed to handle this sort of load.
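
To give a feel for the load involved, the back-of-the-envelope calculation below estimates daily sample volume. The figures are assumptions chosen purely for illustration: two million points, of which 100,000 are polled every five seconds and the rest every five minutes.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(point_count: int, interval_s: int) -> int:
    """Samples collected per day for points polled at a fixed interval."""
    return point_count * (SECONDS_PER_DAY // interval_s)


# Assumed mix: 100,000 critical points and 1,900,000 routine points.
critical = samples_per_day(100_000, 5)
routine = samples_per_day(1_900_000, 300)
total = critical + routine
average_per_second = total / SECONDS_PER_DAY

print(f"critical: {critical:,} samples/day")
print(f"routine:  {routine:,} samples/day")
print(f"total:    {total:,} samples/day")
print(f"average:  {average_per_second:,.0f} samples/second")
```

Under these assumptions the platform would have to collect, convert, evaluate, and store roughly 2.3 billion samples a day, or about 26,000 per second on average.
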
Before you even begin to think about
looking at various bells and whistles in
data center infrastructure
management, make sure you have the
base foundation layer, DCIM
Monitoring, installed and humming.
Otherwise you’ll be building a
skyscraper without a foundation, and
while the antenna on the top of the
building or the fancy windows may
capture attention due to their “wow
factor,” it is the foundation and the
building superstructure that you must
get right. And that’s what DCIM
Monitoring is all about. It is the
foundation and superstructure of data
center infrastructure management.
Sev Onyshkevych is Chief Marketing Officer at FieldView Solutions, Inc. He can be reached at