Networking for the Future of DOE Science
William E. Johnston, ESnet Department Head and Senior Scientist
wej@es.net, www.es.net (this talk is available at www.es.net)
Energy Sciences Network: Networking for the Future of Science
Lawrence Berkeley National Laboratory
Institute of Computer Science (ICS) of the Foundation for Research and Technology – Hellas (FORTH), Feb. 15, 2008
DOE’s Office of Science: Enabling Large-Scale Science
The U.S. Dept. of Energy’s Office of Science (SC) is "the single largest supporter of basic research in the physical sciences in the United States, … providing more than 40 percent of total funding … for the Nation’s research programs in high-energy physics, nuclear physics, and fusion energy sciences" (http://www.science.doe.gov). SC funds 25,000 PhDs and PostDocs.
A primary mission of SC’s National Labs is to build and operate very large scientific instruments - particle accelerators, synchrotron light sources, very large supercomputers - that generate massive amounts of data and involve very large, distributed collaborations.
ESnet - the Energy Sciences Network - is an SC program whose primary mission is to enable the large-scale science of the Office of Science, which depends on: sharing of massive amounts of data; supporting thousands of collaborators worldwide; distributed data processing; distributed data management; distributed simulation, visualization, and computational steering; and collaboration with the US and international research and education community.
ESnet Defined
A national optical circuit infrastructure: ESnet shares an optical network on a dedicated national fiber infrastructure with Internet2 (the US national research and education (R&E) network) and has exclusive use of a group of 10 Gb/s optical channels on this infrastructure.
A large-scale IP network: a tier 1 Internet Service Provider (ISP) with direct connections to all major commercial network providers.
A large-scale science data transport network: multiple high-speed connections to all major US and international research and education (R&E) networks in order to enable large-scale science, providing virtual circuit services specialized to carry the massive science data flows of the National Labs.
An organization of 30 professionals structured for the service: the ESnet organization designs, builds, and operates the network.
An operating entity with an FY06 budget of about $26.5M.
ESnet as a Large-Scale Science Data Transport Network
A few hundred of the 25,000,000 hosts that the Labs connect to every month account for 50% of all ESnet traffic (which is primarily high energy physics, nuclear physics, and climate data at this point). Most of ESnet’s traffic (>85%) goes to and comes from outside of ESnet. This reflects the highly collaborative nature of large-scale science, which is one of the main focuses of DOE’s Office of Science. (The accompanying map marks the R&E source or destination of each of ESnet’s top 100 flows, all R&E; the DOE Lab destination or source of each flow is not shown.)
ESnet Provides Global High-Speed Internet Connectivity for DOE Facilities and Collaborators (12/2007)
[Network map: ~45 end user sites, including Office of Science sponsored (22), NNSA sponsored (13+), joint sponsored (3), other sponsored (NSF LIGO, NOAA), and laboratory sponsored (6) sites; ESnet core hubs; commercial peering points (PAIX-PA, Equinix, etc.); and peerings with specific R&E networks and international networks (CA*net4, GÉANT (France, Germany, Italy, UK, etc.), GLORIAD (Russia, China), Korea (Kreonet2), Japan (SINet), Australia (AARNet), Taiwan (TANet2, ASCC), SingAREN, KAREN/REANNZ, USLHCNet to CERN (DOE+CERN funded), and others). Link legend: international (1-10 Gb/s), 10 Gb/s SDN core (I2, NLR), 10 Gb/s IP core, MAN rings (≥10 Gb/s), Lab supplied links, OC12 / GigEthernet, OC3 (155 Mb/s), 45 Mb/s and less. Geography is only representational; scale bars: 2700 miles / 4300 km and 1625 miles / 2545 km.]
For Scale
The large geographic scale of ESnet (spans on the order of 2700 miles / 4300 km and 1625 miles / 2545 km) makes it a challenge for a small organization to build, maintain, and operate the network.
ESnet History
- ESnet0/MFENet (mid-1970s-1986): 56 Kbps microwave and satellite links.
- ESnet1 (1986-1995): 56 Kbps to 45 Mbps X.25; ESnet formed to serve the Office of Science.
- ESnet2 (1995-2000): IP over 155 Mbps ATM; partnership with Sprint to build the first national footprint ATM network.
- ESnet3 (2000-2007): IP over 2.5 and 10 Gbps SONET; partnership with Qwest to build a national Packet over SONET network and optical channel Metropolitan Area fiber.
- ESnet4 (2007-2012): IP and virtual circuits on a configurable optical infrastructure with at least 5-6 optical channels of 10-100 Gbps each; partnership with Internet2 and the US Research & Education community to build a dedicated national optical network.
Operating Science Mission Critical Infrastructure
ESnet is a visible and critical piece of DOE science infrastructure: if ESnet fails, tens of thousands of DOE and University users know it within minutes if not seconds. This requires high reliability and high operational security in the systems that are integral to the operation and management of the network. Secure and redundant mail and Web systems are central to the operation and security of ESnet (trouble tickets are by email, engineering communication is by email, and engineering database interfaces are via the Web), along with secure network access to hub routers, backup secure telephone modem access to hub equipment, and a 24x7 help desk and 24x7 on-call network engineer, [email_address] (end-to-end problem resolution).
Disaster Recovery and Stability
The network must be kept available even if, e.g., the West Coast is disabled by a massive earthquake. Duplicate infrastructure: full replication of the NOC databases and servers and Science Services databases is being deployed in the NYC Qwest carrier hub. Reliable operation of the network involves: remote Network Operation Centers (4) with remote engineers and partial duplicate infrastructure; engineers and a 24x7 Network Operations Center with generator backed power; replicated support infrastructure (Spectrum network management system, DNS (name - IP address translation), engineering database, load database, configuration database, public and private Web, e-mail server and archive, PKI certificate repository and revocation lists, collaboratory authorization service); generator backed UPS power at all critical network and infrastructure locations; high physical security for all equipment; and a non-interruptible core - the ESnet core operated without interruption through the N. Calif. power blackout of 2000, the 9/11/2001 attacks, and the Sept. 2003 NE States power blackout.
ESnet is a Highly Reliable Infrastructure
The availability chart groups sites into "5 nines" (>99.995%), "4 nines" (>99.95%), and "3 nines" (>99.5%) bands and marks dually connected sites. Note: these availability measures are only for ESnet infrastructure; they do not include site-related problems. Some sites, e.g. PNNL and LANL, provide circuits from the site to an ESnet hub, and therefore the ESnet-site demarc is at the ESnet hub (there is no ESnet equipment at the site). In this case, circuit outages between the ESnet equipment and the site are considered site issues and are not included in the ESnet availability metric.
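To make these availability classes concrete, the short sketch below (an illustration added here, not part of the original slide) converts each threshold into the approximate downtime per year that it permits.

```python
# Approximate yearly downtime budgets implied by the availability classes
# quoted above. Illustrative arithmetic only; ESnet's published availability
# excludes site-related outages, as noted in the slide.
MINUTES_PER_YEAR = 365 * 24 * 60

def max_downtime_minutes(availability_percent: float) -> float:
    """Maximum minutes of outage per year allowed by a given availability."""
    return MINUTES_PER_YEAR * (1.0 - availability_percent / 100.0)

for label, pct in [("5 nines", 99.995), ("4 nines", 99.95), ("3 nines", 99.5)]:
    print(f'"{label}" ({pct}%): about {max_downtime_minutes(pct):.0f} minutes/year')
# "5 nines" allows roughly 26 minutes/year, "4 nines" about 260, "3 nines" about 2600.
```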
ESnet is an Organization Structured for the Service
Functional areas: applied R&D for new network services (circuit services and end-to-end monitoring); network operations and user support (24x7x365, end-to-end problem resolution); network engineering, routing and network services, WAN security; science collaboration services (Public Key Infrastructure certification authorities, AV conferencing, email lists, Web services); internal infrastructure, disaster recovery, security; management, accounting, compliance; deployment and WAN maintenance. (Figures shown on the original organization chart: 3.5, 5.5, 3.0, 7.4, 8.3, 5.5.)
ESnet FY06 Budget is Approximately $26.6M
Approximate budget categories. Total funds ($26.6M): SC funding $20.1M; other DOE $3.8M; carryover $1M; SC Special Projects $1.2M; SC R&D $0.5M. Total expenses ($26.6M): circuits & hubs $12.7M; internal infrastructure, security, and disaster recovery $3.4M; engineering & research $2.9M; WAN equipment $2.0M; collaboration services $1.6M; special projects (Chicago and LI MANs) $1.2M; operations $1.1M; target carryover $1.0M; management and compliance $0.7M.
A Changing Science Environment is the Key Driver of the Next Generation ESnet Large-scale collaborative science – big facilities, massive data, thousands of collaborators – is now a significant aspect of the Office of Science (“SC”) program SC science community is almost equally split between Labs and universities SC facilities have users worldwide Very large international (non-US) facilities (e.g. LHC and ITER) and international collaborators are now a key element of SC science Distributed systems for data analysis, simulations, instrument operation, etc., are essential and are now common (in fact dominate in the data analysis systems that now generate 50% of all ESnet traffic)
The LHC will be the largest scientific experiment and will generate the most data the scientific community has ever tried to manage. The data management model involves a world-wide collection of data centers that store, manage, and analyze the data and that are integrated through network connections with typical speeds in the 10+ Gbps range: closely coordinated and interdependent distributed systems that must have predictable intercommunication for effective functioning. [ICFA SCIC]
FNAL Traffic is Representative of all CMS Traffic Accumulated data (Terabytes) received by CMS Data Centers (“tier1” sites) and many analysis centers (“tier2” sites) during the past 12 months (15 petabytes of data)  [LHC/CMS]  FNAL
When A Few Large Data Sources/Sinks Dominate Traffic it is Not Surprising that Overall Network Usage Follows the Patterns of the Very Large Users - This Trend Will Reverse in the Next Few Weeks as the Next Round of LHC Data Challenges Kicks Off FNAL Outbound Traffic
What Does this Aggregate Traffic Look Like at a Site?
FNAL outbound CMS traffic: max = 1064 MBy/s (8.5 Gb/s), average = 394 MBy/s (3.2 Gb/s).
Total ESnet Traffic Exceeds 2 Petabytes/Month: 2.7 PBytes in July 2007, 1 PByte in April 2006. ESnet traffic historically has increased 10x every 47 months. Overall traffic tracks the very large science use of the network.
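The quoted growth rate (10x every 47 months) can be turned into a simple extrapolation; the sketch below is only an illustration of that stated trend, anchored at the July 2007 figure, not an ESnet forecast.

```python
# Extrapolate ESnet total monthly traffic from the stated historical trend of
# 10x growth every 47 months, anchored at 2.7 PBytes/month in July 2007.
# Purely an illustration of the quoted trend, not a forecast.
def projected_traffic_pbytes(months_after_july_2007: float,
                             base_pbytes: float = 2.7,
                             tenfold_period_months: float = 47.0) -> float:
    """Projected monthly traffic (PBytes) the given number of months out."""
    return base_pbytes * 10 ** (months_after_july_2007 / tenfold_period_months)

for years in (1, 2, 4):
    print(f"+{years} yr: ~{projected_traffic_pbytes(12 * years):.1f} PBytes/month")
# Roughly 4.9, 8.8, and 28 PBytes/month at +1, +2, and +4 years.
```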
Large-Scale Data Analysis Systems (Typified by the LHC) Generate Several Types of Requirements
- The systems are data intensive and high-performance, typically moving terabytes a day for months at a time (the sketch after this list makes the implied data rate concrete).
- The systems are high duty-cycle, operating most of the day for months at a time in order to meet the requirements for data movement.
- The systems are widely distributed, typically spread over continental or inter-continental distances.
- Such systems depend on network performance and availability, but these characteristics cannot be taken for granted, even in well run networks, when the multi-domain network path is considered.
- The applications must be able to get guarantees from the network that there is adequate bandwidth to accomplish the task at hand.
- The applications must be able to get information from the network that allows graceful failure and auto-recovery and adaptation to unexpected network conditions that are short of outright failure.
This slide drawn from [ICFA SCIC].
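To make the first requirement concrete, this sketch converts a daily data volume into the sustained network rate it implies; the 10 and 100 TB/day inputs are illustrative values, since the slide only says "terabytes a day".

```python
# Sustained throughput implied by moving a given data volume every day.
# The input volumes are illustrative; the slide only says "terabytes a day".
SECONDS_PER_DAY = 24 * 3600

def sustained_gbps(terabytes_per_day: float) -> float:
    """Average Gb/s needed to move the given volume in 24 hours."""
    bits_per_day = terabytes_per_day * 1e12 * 8   # decimal TB -> bits
    return bits_per_day / SECONDS_PER_DAY / 1e9

for tb in (10, 100):
    print(f"{tb:>3} TB/day -> ~{sustained_gbps(tb):.2f} Gb/s sustained")
# ~0.93 Gb/s for 10 TB/day, ~9.26 Gb/s for 100 TB/day. Any interruption must be
# made up at a higher rate, which is why these systems want bandwidth guarantees.
```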
Enabling Large-Scale Science These requirements are generally true for systems with widely distributed components in order to be reliable and consistent in performing the sustained, complex tasks of large-scale science To meet these requirements networks must provide high capacity  communication capabilities that are service-oriented: configurable, schedulable, predictable, reliable, and informative and the network and its services must be scalable
ESnet Response to the Requirements
For capacity, a new network architecture and implementation strategy: provide two networks, IP and the circuit-oriented Science Data Network; a rich and diverse network topology for flexible management and high reliability; dual connectivity at every level for all large-scale science sources and sinks; and a partnership with the US research and education community to build a shared, large-scale, R&E-managed optical infrastructure.
For service-oriented capabilities: develop and deploy a virtual circuit service, developing the service cooperatively with the networks that are intermediate between DOE Labs and major collaborators to ensure end-to-end interoperability; develop and deploy service-oriented, user-accessible network monitoring systems; and provide collaboration support services (Public Key Certificate infrastructure, video/audio/data conferencing).
Networking for Large-Scale Science
It is neither financially reasonable, nor necessary, to address this tremendous growth in traffic by growing the IP network. The routers in the IP network are capable of routing (determining the path to the destination of) about 6,000,000 packets per second, and this requires expensive hardware (~$1.5M per router). The paths of the large-scale science traffic are quite static, so routing is not needed. Therefore, ESnet is obtaining the required huge increase in network capacity by using a switched / circuit-oriented network (the switches are about 20-25% of the cost of routers).
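The 20-25% figure is consistent with the approximate list prices quoted later in this talk for a typical ESnet4 hub (a Cisco 7609 SDN switch at about $365K versus a Juniper M320 core IP router at about $1.65M); the quick check below just does that arithmetic.

```python
# Sanity check of the "switches are about 20-25% of the cost of routers" claim
# using the approximate list prices quoted for a typical ESnet4 hub.
router_list_price = 1.65e6   # Juniper M320 core IP router (~$1.65M, list)
switch_list_price = 0.365e6  # Cisco 7609 SDN switch (~$365K, list)

ratio = switch_list_price / router_list_price
print(f"switch is about {ratio:.0%} of the router's list price")  # ~22%
```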
ESnet4 Network Architecture
[Diagram legend: ESnet sites, ESnet IP hubs, ESnet SDN hubs, metro area rings (MANs), circuit connections to other science networks (e.g. USLHCNet), other IP networks.] The multiple interconnected rings provide high reliability. The IP network physical infrastructure is packet over (un-switched) Ethernet; the Science Data Network (SDN) physical infrastructure is switched Ethernet circuits. Core-to-site-to-core dual connections serve those sites not on metro area rings.
Approach to Satisfying Capacity Requirements - a Scalable Optical Infrastructure
ESnet4 backbone optical circuit configuration (December 2008). [Backbone map showing ESnet hubs (IP and SDN devices) and future ESnet hubs at Seattle, Sunnyvale, Los Angeles, San Diego, El Paso, Houston, Kansas City, Chicago, Cleveland, Boston, New York City, Washington DC, Atlanta, Nashville, Denver, and Albuquerque, with 20G marked on several segments. Legend: 10 Gb/s SDN core (NLR), 10/2.5 Gb/s IP core (Qwest), 10 Gb/s IP core (Level3), 10 Gb/s SDN core (Level3), MAN rings (≥10 Gb/s), Lab supplied links.]
Typical ESnet and Internet2 Core Hub
[Diagram: east, west, and north/south fibers terminate on the shared Internet2/Level3/Infinera national optical infrastructure (Infinera DTN with BMM modules); the hub also houses a Ciena CoreDirector grooming device, the T640 and M320 routers, the SDN core switch, optical interfaces to R&E regional networks and ESnet metro-area networks, support devices for measurement, out-of-band access, monitoring, and security, dynamically allocated and routed waves (future), and various equipment and experimental control plane management systems. A network testbed is implemented as an optical overlay.]
Conceptual Architecture of the Infinera DTN Digital Terminal
- Band Mux Module (BMM): multiplexes 100 Gb/s bands onto a 400 Gb/s or 800 Gb/s line.
- Digital Line Module (DLM): 100 Gb/s DWDM line module (10 x 10 Gb/s); an integrated digital switch enables add/drop and grooming at 2.5 Gb/s (ODU-1) granularity.
- Tributary Adapter Module (TAM): provides the customer-facing optical interface; supports a variety of data rates and service types; up to 5 TAMs per DLM.
- Tributary Optical Module (TOM): pluggable client-side optical module (XFP, SFP); supports 10GE and OC48/OC192 SONET.
- Optical Supervisory Channel (OSC): carries management plane traffic (traffic from the remote management systems to access network elements for the purpose of managing them), control plane traffic (GMPLS routing and signaling control protocol traffic), and datawire traffic (customer management traffic carried by interconnecting customers' 10 Mbps Ethernet LAN segments at various sites through AUX port interfaces).
Notes: this conceptual architecture is based on W. Johnston's understanding of the Infinera DTN; all channels are full duplex, which is not made explicit here; the switch fabric operates within a DLM and also interconnects the DLMs; the Management Control Module is not shown.
ESnet Metropolitan Area Network Ring Architecture for High Reliability Sites
MAN fiber ring: 2-4 x 10 Gbps channels provisioned initially, with expansion capacity to 16-64. [Diagram: the MAN ring connects each large science site (site gateway router, site edge router, site LAN, SDN circuits to site systems) to both the ESnet production IP core and the ESnet SDN core through east and west core connections; ESnet MAN site switches and independent port cards supporting multiple 10 Gb/s line interfaces deliver the ESnet production IP service, ESnet-managed λ / circuit services, and ESnet-managed virtual circuit services tunneled through the IP backbone (virtual circuits to the site); US LHCnet switches connect at the SDN core switches.]
Typical Medium Size ESnet4 Hub
Equipment: Juniper M320 core IP router (about $1.65M, list); Cisco 7609 Science Data Network switch (about $365K, list); Juniper M7i IP peering router (about $60K, list); 10 Gbps network tester; secure terminal server (telephone modem access); local Ethernet; DC power controllers.
Note that the major ESnet sites are now directly on the ESnet "core" network.
[ESnet4 map: lab sites, layer 1 optical nodes at eventual ESnet points of presence, ESnet IP switch-only hubs, ESnet IP switch/router hubs, and ESnet SDN switch hubs across the national footprint, with per-segment wave counts and Internet2 circuit numbers. The metropolitan area networks shown include: the San Francisco Bay Area MAN (LBNL, SLAC, JGI, LLNL, SNLL, NERSC); the West Chicago MAN (FNAL, 600 W. Chicago, Starlight, ANL, USLHCNet); the Long Island MAN (BNL, 32 AoA NYC, USLHCNet); the Atlanta MAN (180 Peachtree, 56 Marietta (SOX), ORNL, Wash. DC, Houston, Nashville); and JLab, ELITE, ODU, and MATP in the Wash. DC area. The bandwidth into and out of FNAL is equal to, or greater than, the ESnet core bandwidth. Legend: ESnet IP core, ESnet Science Data Network core, ESnet SDN core NLR links (existing), Lab supplied links, LHC related links, MAN links, international IP connections.]
ESnet4 IP + SDN, 2011 Configuration
[Map of the planned 2011 configuration showing the same national hub locations, with per-segment wave (λ) counts of 3-5 and Internet2 circuit numbers. Legend: layer 1 optical nodes at eventual ESnet points of presence, ESnet IP switch-only hubs, ESnet IP switch/router hubs, ESnet SDN switch hubs, lab sites, ESnet IP core, ESnet Science Data Network core, ESnet SDN core NLR links (existing), Lab supplied links, LHC related links, MAN links, international IP connections.]
ESnet4 Planned Configuration: 40-50 Gbps in 2009-2010, 160-400 Gbps in 2011-2012
[Map: production IP core (10 Gbps), SDN core (20-30-40 Gbps), MANs (20-60 Gbps) or backbone loops for site access, and international connections, with IP core hubs, SDN (switch) hubs, primary DOE Labs, high speed cross-connects with Internet2/Abilene, and possible hubs at Seattle, Sunnyvale, LA, San Diego, Denver, Albuquerque, El Paso, Houston, Kansas City, Tulsa, Chicago, Cleveland, Boise, Atlanta, Jacksonville, New York, Boston, and Washington DC. International connections include CERN (30 Gbps, two paths), Canada (CANARIE), Europe (GÉANT), Asia-Pacific, GLORIAD (Russia and China), Australia, and South America (AMPATH). The core network fiber path is ~14,000 miles / 24,000 km; scale bars: 2700 miles / 4300 km and 1625 miles / 2545 km.]
Multi-Domain Virtual Circuits as a Service
- Guaranteed bandwidth service: user specified bandwidth, requested and managed in a Web Services framework.
- Traffic isolation and traffic engineering: provides for high-performance, non-standard transport mechanisms that cannot co-exist with commodity TCP-based transport, and enables the engineering of explicit paths to meet specific requirements, e.g. bypassing congested links or using lower bandwidth, lower latency paths.
- Secure connections: the circuits are "secure" to the edges of the network (the site boundary) because they are managed by the control plane of the network, which is isolated from the general traffic.
- End-to-end (cross-domain) connections between Labs and collaborating institutions.
- Reduced cost of handling high bandwidth data flows: highly capable routers are not necessary when every packet goes to the same place; lower cost (factor of 5x) switches can forward the packets.
Virtual Circuit Service Functional Requirements
- Support user/application VC reservation requests: source and destination of the VC; bandwidth, latency, start time, and duration of the VC; traffic characteristics (e.g. flow specs) to identify traffic designated for the VC. (A sketch of such a request follows this list.)
- Manage allocations of scarce, shared resources: authentication to prevent unauthorized access to this service; authorization to enforce policy on reservation/provisioning; gathering of usage data for accounting.
- Provide virtual circuit setup and teardown mechanisms and security: widely adopted and standard protocols (such as MPLS and GMPLS) are well understood within a single domain; cross-domain interoperability is the subject of ongoing, collaborative development; secure end-to-end connection setup is provided by the network control plane; accommodate heterogeneous circuit abstractions (e.g. MPLS, GMPLS, VLANs, VCAT/LCAS).
- Enable the claiming of reservations: traffic destined for the VC must be differentiated from "regular" traffic.
- Enforce usage limits: per-VC admission control polices usage, which in turn facilitates guaranteed bandwidth; consistent per-hop QoS throughout the network provides transport predictability.
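As referenced in the first item above, a reservation request of this kind can be captured in a small data structure. The sketch below is hypothetical: it mirrors the fields listed on the slide but is not the actual OSCARS request schema, and the endpoint names and prefixes are illustrative.

```python
# Hypothetical shape of a virtual-circuit reservation request covering the
# fields listed above: endpoints, bandwidth, timing, and flow specs used to
# identify the traffic allowed to claim the reservation. Not the real OSCARS
# request format; names and prefixes are illustrative.
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional

@dataclass
class FlowSpec:
    src_prefix: str                  # source host or prefix
    dst_prefix: str                  # destination host or prefix
    protocol: str = "tcp"
    dst_port: Optional[int] = None

@dataclass
class CircuitReservation:
    source: str                      # ingress edge of the circuit
    destination: str                 # egress edge of the circuit
    bandwidth_mbps: int              # guaranteed bandwidth
    start_time: datetime
    duration: timedelta
    flow_specs: List[FlowSpec] = field(default_factory=list)

request = CircuitReservation(
    source="lab-edge-router",                  # illustrative endpoint names
    destination="collaborator-edge-router",
    bandwidth_mbps=5000,
    start_time=datetime(2008, 3, 1, 0, 0),
    duration=timedelta(hours=12),
    flow_specs=[FlowSpec("198.51.100.0/24", "203.0.113.0/24")],  # example prefixes
)
```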
ESnet Virtual Circuit Service: OSCARS (On-demand Secured Circuits and Advanced Reservation System)
Software architecture (see ref. 9):
- Web-Based User Interface (WBUI): prompts the user for a username/password (or Grid proxy certificate) and forwards it to the AAAS.
- Authentication, Authorization, and Auditing Subsystem (AAAS): handles access, enforces policy, and generates usage records.
- Bandwidth Scheduler Subsystem (BSS): tracks reservations and maps the state of the network (present and future).
- Path Setup Subsystem (PSS): sets up and tears down the on-demand paths (LSPs) by issuing instructions to routers and switches.
The Reservation Manager ties these together: a human user submits a request via the WBUI, a user application submits a request via the AAAS, and feedback is returned to the user. (A simplified sketch of this flow follows.)
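To show how these subsystems fit together, here is a simplified, hypothetical sketch of a reservation flowing through them; all class and function names are invented for illustration and are not the real OSCARS interfaces.

```python
# Simplified, hypothetical flow of a reservation through the subsystems named
# above (WBUI -> AAAS -> BSS -> PSS). All names are invented for illustration;
# they are not the real OSCARS code or interfaces.
class AAAS:
    """Authentication, Authorization, and Auditing subsystem (stubbed)."""
    def authenticate(self, credentials):
        return credentials["user"]
    def authorize(self, user, request):
        return True                                  # placeholder policy check
    def record_usage(self, user, request):
        print(f"audit: {user} reserved {request}")

class BSS:
    """Bandwidth Scheduler subsystem: tracks reservations (stubbed)."""
    def __init__(self):
        self.reservations = []
    def schedule(self, request):
        self.reservations.append(request)            # a real BSS would check for conflicts
        return {"id": len(self.reservations), **request}

class PSS:
    """Path Setup subsystem: would configure routers/switches for LSPs (stubbed)."""
    def setup(self, slot):
        print(f"setting up LSP for reservation {slot['id']}")
    def teardown(self, slot):
        print(f"tearing down LSP for reservation {slot['id']}")

aaas, bss, pss = AAAS(), BSS(), PSS()

def reserve(credentials, request):
    user = aaas.authenticate(credentials)            # WBUI forwards credentials here
    if not aaas.authorize(user, request):
        raise PermissionError("request denied by policy")
    slot = bss.schedule(request)
    aaas.record_usage(user, request)                 # usage record for accounting
    return slot

slot = reserve({"user": "alice"}, {"src": "site-A", "dst": "site-B", "gbps": 10})
pss.setup(slot)       # at the reservation start time
pss.teardown(slot)    # at the reservation end time
```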
Environment of Science is Inherently Multi-Domain End points will be at independent institutions – campuses or research institutes - that are served by ESnet, Abilene, GÉANT, and their regional networks Complex inter-domain issues – typical circuit will involve five or more domains - of necessity this involves collaboration with other networks For example, a connection between FNAL and DESY involves five domains, traverses four countries, and crosses seven time zones FNAL (AS3152) [US] ESnet (AS293) [US] GEANT (AS20965) [Europe] DFN (AS680) [Germany] DESY (AS1754) [Germany]
Inter-domain Reservations: A Hard Problem
Motivation: for a virtual circuit service to be successful, it must be end-to-end, potentially crossing several administrative domains, and have consistent network service guarantees throughout the circuit.
Observation: setting up an intra-domain circuit is easy compared with coordinating an inter-domain circuit.
Issues:
- Cross-domain authentication and authorization: a mechanism to authenticate and authorize a bandwidth on-demand (BoD) circuit request must be agreed upon in order to automate the process.
- Multi-domain Acceptable Use Policies (AUPs): domains may have very specific AUPs dictating what the BoD circuits can be used for and where they can transit/terminate.
- Domain specific service offerings: domains must have a way to guarantee a certain level of service for BoD circuits.
- Security concerns: are there mechanisms for a domain to protect itself (e.g. RSVP filtering)? Mostly "no", so there will not be signaling across domain boundaries.
Environment of Science is Inherently Multi-Domain
In order to set up end-to-end circuits across multiple domains without violating the security or allocation management policy of any of the domains, the process of setting up end-to-end circuits is handled by a broker that negotiates for bandwidth that is available for this purpose, and requests that the domains each establish the "meet-me" state so that the circuit in one domain can connect to a circuit in another domain (thus maintaining domain control over the circuits). In the FNAL-DESY example, the broker works with the local domain circuit managers of FNAL (AS3152) [US], ESnet (AS293) [US], GEANT (AS20965) [Europe], DFN (AS680) [Germany], and DESY (AS1754) [Germany]. (A sketch of this brokered setup follows.)
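As referenced above, the brokered setup can be sketched as a loop over per-domain circuit managers; the interface below is invented for illustration, not an actual inter-domain protocol.

```python
# Hypothetical sketch of the brokered multi-domain setup described above: the
# broker negotiates bandwidth with each domain's local circuit manager and asks
# adjacent domains to establish "meet-me" state, so each domain keeps control
# of its own segment. The interface is illustrative, not a real protocol.
class LocalCircuitManager:
    def __init__(self, domain):
        self.domain = domain
    def offer_bandwidth(self, gbps):
        return gbps                   # a real manager would apply local policy/allocations
    def create_segment(self, gbps, meet_me_in, meet_me_out):
        print(f"{self.domain}: {gbps} Gb/s segment, meet-me {meet_me_in} -> {meet_me_out}")

def broker_circuit(path, gbps):
    """path: ordered list of LocalCircuitManager objects along the end-to-end route."""
    granted = min(mgr.offer_bandwidth(gbps) for mgr in path)   # negotiate a common rate
    for i, mgr in enumerate(path):
        meet_in = path[i - 1].domain if i > 0 else "end site A"
        meet_out = path[i + 1].domain if i < len(path) - 1 else "end site B"
        mgr.create_segment(granted, meet_in, meet_out)
    return granted

# The FNAL-to-DESY example from the slide: five domains, one broker.
path = [LocalCircuitManager(d) for d in ("FNAL", "ESnet", "GEANT", "DFN", "DESY")]
broker_circuit(path, gbps=10)
```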
OSCARS Update
OSCARS is the ESnet control plane mechanism: it establishes the circuit path; transport is then via MPLS and VLANs. OSCARS has been ported from Perl to Java to better support web services, and this is now the common code base for OSCARS and I2's BRUW. A paper on OSCARS was accepted by IEEE GridNets. Collaborative efforts: working with I2 and DRAGON to support interoperability between OSCARS/BRUW and DRAGON (currently in the process of installing an instance of DRAGON in ESnet); working with I2, DRAGON, and TeraPaths (Brookhaven Lab) to determine an appropriate interoperable AAI (authentication and authorization infrastructure) framework (in conjunction with GEANT2's JRA5); working with the DICE Control Plane group to determine schema and methods of distributing topology and reachability information (DICE = Internet2, ESnet, GEANT, CANARIE/UCLP; see http://www.garr.it/dice/presentation.htm for presentations from the last meeting); and working with Tom Lehman (DRAGON), Nagi Rao (USN), and Nasir Ghani (Tennessee Tech) on multi-level, multi-domain hybrid network performance measurements (funded by OASCR).

Monitoring as a Service-Oriented Communications Service
perfSONAR is a community effort to define network management data exchange protocols and standardized measurement data gathering and archiving. Path performance monitoring is an example of a perfSONAR application: it provides users/applications with end-to-end, multi-domain traffic and bandwidth availability, and provides real-time performance such as path utilization and/or packet drop. Multi-domain path performance monitoring tools are in development based on perfSONAR protocols and infrastructure. One example, Traceroute Visualizer [TrViz], has been deployed in about 10 R&E networks in the US and Europe that have deployed at least some of the required perfSONAR measurement archives to support the tool.
perfSONAR Architecture
[Architecture diagram: measurement points (m1…m6) are the real-time feeds from the network or local monitoring devices; a measurement export service converts each local measurement to a standard format for that measurement type; above that sit measurement archives, a topology aggregator, a service locator, and a path monitor spanning multiple network domains; clients include a performance GUI, an event subscription service, and client applications (e.g. part of an application system communication service manager). Example uses: a real-time end-to-end performance graph (e.g. bandwidth or packet loss vs. time), historical performance data for planning purposes, and event subscription (e.g. end-to-end path segment outage).]
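As a rough illustration of a client consuming measurement-archive data of the kind shown in the architecture above, here is a hypothetical sketch; the URL and JSON layout are invented, since the actual perfSONAR services define their own (XML-based) query protocols.

```python
# Hypothetical client sketch: fetch link-utilization history from a measurement
# archive and report the peak, in the spirit of the path-performance monitoring
# described above. The endpoint URL and JSON layout are invented for
# illustration; real perfSONAR archives use their own (XML-based) protocols.
import json
from urllib.request import urlopen

ARCHIVE_URL = "https://ma.example.net/utilization"    # hypothetical endpoint

def peak_utilization_bps(interface: str, hours: int = 24) -> float:
    """Return the peak utilization (bits/s) for an interface over a time window."""
    query = f"{ARCHIVE_URL}?interface={interface}&hours={hours}"
    with urlopen(query) as resp:                       # assumes a JSON list of samples
        samples = json.load(resp)                      # e.g. [{"ts": ..., "bps": ...}, ...]
    return max(sample["bps"] for sample in samples)

# Example call (requires a reachable archive at the hypothetical URL):
# print(peak_utilization_bps("hub1:xe-0/0/0"))
```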
  • 41. Traceroute Visualizer
Forward-direction bandwidth utilization on the application path from LBNL to INFN-Frascati (Italy): traffic is shown as bars on those network device interfaces that have associated MP (measurement point) services (the first 4 graphs are normalized to 2000 Mb/s, the last to 500 Mb/s). Link capacity is also provided.
1 ir1000gw (131.243.2.1)
2 er1kgw
3 lbl2-ge-lbnl.es.net
4 slacmr1-sdn-lblmr1.es.net (GRAPH OMITTED)
5 snv2mr1-slacmr1.es.net (GRAPH OMITTED)
6 snv2sdn1-snv2mr1.es.net
7 chislsdn1-oc192-snv2sdn1.es.net (GRAPH OMITTED)
8 chiccr1-chislsdn1.es.net
9 aofacr1-chicsdn1.es.net (GRAPH OMITTED)
10 esnet.rt1.nyc.us.geant2.net (NO DATA)
11 so-7-0-0.rt1.ams.nl.geant2.net (NO DATA)
12 so-6-2-0.rt1.fra.de.geant2.net (NO DATA)
13 so-6-2-0.rt1.gen.ch.geant2.net (NO DATA)
14 so-2-0-0.rt1.mil.it.geant2.net (NO DATA)
15 garr-gw.rt1.mil.it.geant2.net (NO DATA)
16 rt1-mi1-rt-mi2.mi2.garr.net
17 rt-mi2-rt-rm2.rm2.garr.net (GRAPH OMITTED)
18 rt-rm2-rc-fra.fra.garr.net (GRAPH OMITTED)
19 rc-fra-ru-lnf.fra.garr.net (GRAPH OMITTED)
20
21 www6.lnf.infn.it (193.206.84.223) 189.908 ms 189.596 ms 189.684 ms
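Below is a hedged sketch of the core idea behind a Traceroute Visualizer-style tool: given the forward path (the hop list from a traceroute, like the one above), mark each hop according to whether a known perfSONAR measurement archive can supply a utilization graph for it. The domain-suffix lookup stands in for a real service-locator query and is an assumption for illustration.

def annotate_path(hops, archive_domains):
    """hops: hostnames along the path; archive_domains: DNS suffixes of
    networks known to publish perfSONAR measurement archives."""
    report = []
    for hop in hops:
        covered = any(hop.endswith(suffix) for suffix in archive_domains)
        report.append((hop, "graph available" if covered else "NO DATA"))
    return report

# Example with a few hops from the LBNL -> INFN-Frascati path above; only
# networks with deployed archives yield graphs, matching the slide.
path = ["lbl2-ge-lbnl.es.net", "snv2sdn1-snv2mr1.es.net",
        "esnet.rt1.nyc.us.geant2.net", "rt1-mi1-rt-mi2.mi2.garr.net"]
for hop, status in annotate_path(path, archive_domains={".es.net", ".garr.net"}):
    print(f"{hop:35s} {status}")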
  • 42. Federated Trust Services
Remote, multi-institutional identity authentication is critical for distributed, collaborative science in order to permit sharing of widely distributed computing and data resources and other Grid services. Public Key Infrastructure (PKI) is used to formalize the existing web of trust within science collaborations and to extend that trust into cyberspace. The function, form, and policy of the ESnet trust services are driven entirely by the requirements of, and direct input from, the science community. International-scope trust agreements that encompass many organizations are crucial for large-scale collaborations. The service (and community) has matured to the point where it is revisiting old practices and updating and formalizing them.
  • 43. Federated Trust Services – Support for Large-Scale Collaboration
CAs are provided with different policies as required by the science community:
- DOEGrids CA has a policy tailored to accommodate international science collaboration.
- NERSC CA policy integrates CA and certificate issuance with NIM (the NERSC user accounts management service).
- FusionGrid CA supports the FusionGrid roaming authentication and authorization services, providing complete key lifecycle management.
(Diagram: the ESnet root CA signs the subordinate DOEGrids CA, NERSC CA, FusionGrid CA, and other CAs. See www.doegrids.org.)
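As a minimal sketch of how a relying party in a collaboration decides whether to trust a certificate issued under the hierarchy above, the snippet walks issuer links until it reaches the trusted root. It checks only the name chain, which is an illustrative simplification: real PKI verification also checks signatures, validity periods, and revocation (CRLs), and the issuer table here merely mirrors the slide's diagram.

TRUSTED_ROOT = "ESnet root CA"

# Hypothetical issuer relationships mirroring the CA hierarchy above.
ISSUED_BY = {
    "DOEGrids CA": TRUSTED_ROOT,
    "NERSC CA": TRUSTED_ROOT,
    "FusionGrid CA": TRUSTED_ROOT,
}

def chain_to_root(issuing_ca):
    """Return the issuer chain from a subordinate CA up to the root,
    or None if the chain does not terminate at the trusted root."""
    chain = [issuing_ca]
    current = issuing_ca
    while current != TRUSTED_ROOT:
        current = ISSUED_BY.get(current)
        if current is None:
            return None
        chain.append(current)
    return chain

print(chain_to_root("DOEGrids CA"))   # ['DOEGrids CA', 'ESnet root CA']
print(chain_to_root("SomeOther CA"))  # None -> not trusted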
  • 44. DOEGrids CA (Active Certificates) Usage Statistics
(Usage chart; report as of Jan 17, 2008.) The US LHC ATLAS project has adopted the ESnet CA service.
  • 45. DOEGrids CA Usage - Virtual Organization Breakdown
(Chart footnotes: * DOE-NSF collaborations and auto-renewals; ** OSG includes BNL, CDF, CIGI, CMS, CompBioGrid, DES, DOSAR, DZero, Engage, Fermilab, fMRI, GADU, geant4, GLOW, GPN, GRASE, GridEx, GROW, GUGrid, i2u2, ILC, iVDGL, JLAB, LIGO, mariachi, MIS, nanoHUB, NWICG, NYGrid, OSG, OSGEDU, SBGrid, SDSS, SLAC, STAR & USATLAS.)
  • 46. ESnet Conferencing Service (ECS)
An ESnet Science Service that provides audio, video, and data teleconferencing to support human collaboration in DOE science. Seamless voice, video, and data teleconferencing is important for geographically dispersed scientific collaborators, and ECS provides the central scheduling essential for global collaborations. ECS serves about 1600 DOE researchers and collaborators worldwide at 260 institutions:
- Videoconferencing - about 3500 port hours per month
- Audio conferencing - about 2300 port hours per month
- Data conferencing - about 220 port hours per month
Web-based, automated registration and scheduling is provided for all of these services.
  • 48. ECS Video Collaboration Service
- High-quality videoconferencing over IP and ISDN
- Reliable, appliance-based architecture
- Ad hoc H.323 and H.320 multipoint meeting creation
- Web streaming options on 3 Codian MCUs using QuickTime or Real
- 3 Codian MCUs with web conferencing options; 120 total video conferencing ports (40 ports per MCU)
- 384k access for video conferencing systems using the ISDN protocol
- Access to the audio portion of video conferences through the Codian ISDN Gateway
  • 49. ECS Voice and Data Collaboration
Audio conferencing and data collaboration using Cisco MeetingPlace (data collaboration = WebEx-style desktop sharing and remote viewing of content):
- 144 usable ports: actual conference ports readily available on the system.
- 144 overbook ports: ports reserved to allow scheduling beyond the number of conference ports readily available on the system.
- 108 floater ports: designated for unexpected port needs; floater ports can float between meetings, taking up the slack when an extra person attends a meeting that is already full or when ports that can be scheduled in advance are not available.
- Web-based user registration
- Web-based scheduling of audio/data conferences
- Email notifications of conferences and conference changes
- 650+ users registered to schedule meetings (not including guests)
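The sketch below illustrates the port-pool idea behind the numbers above: a scheduler books against the conference ports (and may overbook them), while a separate floater pool absorbs unplanned attendees at meeting time. The admission policy shown is an illustrative simplification under those assumptions, not Cisco MeetingPlace's actual algorithm.

class PortPools:
    def __init__(self, conference_ports=144, overbook_ports=144, floater_ports=108):
        self.schedulable = conference_ports + overbook_ports  # booking limit
        self.conference_free = conference_ports               # real capacity
        self.floater_free = floater_ports

    def schedule(self, seats):
        """Accept a reservation if it fits under the overbooked limit."""
        if seats > self.schedulable:
            return False
        self.schedulable -= seats
        return True

    def join(self):
        """Seat one attendee at meeting time: use a conference port if one
        is free, otherwise fall back to the floater pool."""
        if self.conference_free > 0:
            self.conference_free -= 1
            return "conference port"
        if self.floater_free > 0:
            self.floater_free -= 1
            return "floater port"
        return None  # no capacity left; the call is refused

pools = PortPools()
pools.schedule(seats=10)
print([pools.join() for _ in range(3)])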
  • 50. Summary
ESnet is currently satisfying its mission by enabling SC science that is dependent on networking and distributed, large-scale collaboration. ESnet has put considerable effort into gathering requirements from the DOE science community, and has a forward-looking plan and the expertise to meet the five-year SC requirements. A DOE-led external review of ESnet (Feb. 2006) strongly endorsed the plan for ESnet4.
  • 51. References
- High Performance Network Planning Workshop, August 2002. http://www.doecollaboratory.org/meetings/hpnpw
- Science Case Studies Update, 2006 (contact eli@es.net)
- DOE Science Networking Roadmap Meeting, June 2003. http://www.es.net/hypertext/welcome/pr/Roadmap/index.html
- DOE Workshop on Ultra High-Speed Transport Protocols and Network Provisioning for Large-Scale Science Applications, April 2003. http://www.csm.ornl.gov/ghpn/wk2003
- Science Case for Large Scale Simulation, June 2003. http://www.pnl.gov/scales/
- Workshop on the Road Map for the Revitalization of High End Computing, June 2003. http://www.cra.org/Activities/workshops/nitrd ; http://www.sc.doe.gov/ascr/20040510_hecrtf.pdf (public report)
- ASCR Strategic Planning Workshop, July 2003. http://www.fp-mcs.anl.gov/ascr-july03spw
- Planning Workshops - Office of Science Data-Management Strategy, March & May 2004. http://www-conf.slac.stanford.edu/dmw2004
- For more information contact Chin Guok ( [email_address] ); also see http://www.es.net/oscars
- [LHC/CMS] http://cmsdoc.cern.ch/cms/aprom/phedex/prod/Activity::RatePlots?view=global
- [ICFA SCIC] “Networking for High Energy Physics.” International Committee for Future Accelerators (ICFA), Standing Committee on Inter-Regional Connectivity (SCIC), Professor Harvey Newman, Caltech, Chairperson. http://monalisa.caltech.edu:8080/Slides/ICFASCIC2007/
- [E2EMON] GÉANT2 E2E Monitoring System - developed and operated by JRA4/WI3, with implementation done at DFN. http://cnmdev.lrz-muenchen.de/e2e/html/G2_E2E_index.html ; http://cnmdev.lrz-muenchen.de/e2e/lhc/G2_E2E_index.html
- [TrViz] ESnet PerfSONAR Traceroute Visualizer. https://performance.es.net/cgi-bin/level0/perfsonar-trace.cgi