Meeting 2019-03-11

Attendees


  • Brian
  • Derek
  • Shawn
  • John

RSV replacement status

  • Derek has looked over the results
    • Rates look good
    • Data looks good
  • There might be issues due to the loss of state when the container is restarted
    • Does the first collection attempt after a restart fail, with the second one being more accurate?
    • The first collection after a restart needs to catch up on all the data
  • What are the next steps to move this to production?
    • Shawn is ready to switch as long as the data is validated
  • Monitoring
    • Host “psrsv” on (cert required)
    • Query time:

OSG check_mk

  • Discussion on architecture
    • Monitor with multiple servers? (using check_mk caching agent to avoid overhead)
    • Use the distributed monitoring features?
    • Shawn prefers to keep psetf focused on pS monitoring

Action items

  • Derek: RSV replacement
    • Map the state directory to a host path (under /usr/local or /var) so that state is preserved across container restarts
    • Run the new RSV collector for 24 hours and make a go/no-go decision on 2019-03-12
  • John, along with Huijun/Derek: check_mk
    • Work with Shawn on integration of the OSG check_mk site with psetf
  • Derek: Don’t send summary info to the bus
  • Shawn: Add a pS ETF link to the sand-ci website
  • Derek: Add a collector status link to the sand-ci website
  • Shawn: Further polishing of the sand-ci OSG architecture document (PR #13)