Meeting 2018-10-08

Attendees

  • Brian
  • Derek
  • Edgar
  • John

Discussed items

perfSonar collectors

  • Two sets of servers
    • One set goes to UChicago
    • One set goes to GRACC
  • Custom written collector (Ilya)
    • Derek: Look at the existing container for details
    • Ilya or Marian Babik would have details
    • Can it be converted to LogStash?

perfSonar RSV components (Edgar)

  • One probe that fetches the content of the JSON which has one line
    • Each entry is a mesh
    • Goes through each mesh finding which host is registered
    • For each host, there’s a probe which visits periodically and pushes contents to MQ
    • If there’s a probe with no meshes, it’s disabled
  • One probe which talks to each host and gets new info (since last run)
    • Pushes info to MQ
  • On GitHub at opensciencegrid/rsv-perfsonar
    • network-monitoring-local-probe:parseJsonUrl()
    • uploader/caller.py
      • Runs in a Python 2.7 environment
      • Parent class
      • Posts data to RabbitMQ
      • Input to uploader is configuration path:
        • Reads from /etc/rsv/metrics/${metricName}.conf
    • uploader/activemquploader.py and uploader/esmonduploader.py are legacy
    • Authentication to hosts is sometimes required: HTTPS with client certs
      • Derek: Does FNAL maintain a list of trusted client certs? Edgar: FNAL trusts any cert. Goal was SSL protection.

Migrate perfSonar RSV to systemd

  • Derek: Will look at the RSV code and estimate the development effort
  • Derek: How many probes? Edgar: On the order of 200. One per perfSonar instance in the wild.
  • Brian: Are they written like python modules or scripts? Edgar: More like scripts and not very much code.
  • Brian’s design #1: Python multiprocessing
    • Implement internal scheduling
    • Likely fewer issues than the systemd approach
  • Brian’s design #2: Move this to a set of systemd-driven services
    • Model after CVMFS Stratum-1 Replicator
    • Templated services and timers
    • Is there a limit to the number of simultaneous python processes?
      • Too high of memory requirement?
    • Repeated RabbitMQ authentications might be too expensive

pS to RabbitMQ discussion (cont’d)

  • Client certificates (LetsEncrypt) auth appears to be a maybe
    • UNL uses LetsEncrypt
    • FNAL uses a *.fnal.gov wildcard
    • TAMU uses a self-signed cert
    • Derek: Can RabbitMQ trust client certs via wildcard?

Periodic reports

  • Weekly service performance reports via email
  • Can use existing GRACC reporting mechanism?
  • Reporting in addition to dashboards

Existing items

  • Explore authentication options for sending site data to RabbitMQ
    • Need Shawn’s input
    • LetsEncrypt client certificates?
    • May still need to automate MQ config changes to add DNs
  • Expanding web presence
    • Shawn: Create github PR and Derek will review
    • Logo, both text and graphical
    • Pointers to existing OSG network docs
  • Complete documentation for each service component
  • How to send check_mk status info from Nebraska to OSG perfSonar ETF?
    • LiveStatus API?

Updated: