Lync – A Tale of Stretching the Limits of Supportability

This blog post will highlight the infrastructure abilities of Lync when thinking a bit outside of the box on how to design Lync to meet very specific needs.  It’s not that I am condoning deploying Lync outside of supportability, but rather just showing that Lync truly can be flexible.
I recently finished up a global deployment of Lync Server 2013 where the planning and build of the infrastructure alone took about 5 months.  During the planning and build we mainly focused on getting the overall Topology designed correctly, including SIP Domains, Certificates, HA, DR, capacity planning and a “Hidden DMZ”, and that is why it took so long… Or maybe it was because the client was a multibillion dollar company who had to move cautiously?  Oh and yes, I did say Hidden DMZ, you read it correctly.  If you are interested in the gory details of such a design and the “Hidden DMZ”, not to be confused with their traditional DMZ, read on because I do have to say it’s pretty interesting…
I do want to put a disclaimer on this post before I continue: the overall concept of this design was spawned before I was assigned to this project.  So when you read about this unorthodox design, understand that it had already been signed off on by Microsoft, yet the risks of such a design were still conveyed by Microsoft, myself and a fellow Microsoft MCM/MCSM.  The organization is one of the largest companies in the world, so you can only imagine how that played in their favor and was viewed as a major win.  The goal of this project was to move their 3rd party hosted conferencing solution to an on-premises solution utilizing Lync Server 2013 for close to 100k users, saving the recurring hosted conferencing cost while continuing to grow Enterprise Voice.  Also, when you read through this blog post, you will inevitably start looking beyond Lync and start to question why this organization practices such security tactics.  For the sake of the Lync design, understand that this organization has an outstanding Security division with protections in place that are far greater than we will understand, so unless you are qualified as a security expert, don’t get stuck on why they choose to do what they do.  Lync Server simply had to adhere, to the best of its abilities, to the organization’s tight restrictions, so getting Lync into a position to be accepted on the perimeter (in the DMZ) was a challenge in itself.
Setting the Stage:
So I’ll set the tone with the environment first and foremost.  I came into the project about 1/3 of the way through the planning stages, and the VERY first thing mentioned to me in my kickoff meeting was “We are going to put Front End Pools in the DMZ… will it work?”.  Well, needless to say, I’m thinking: “What the hell did I just get into?”.  I took the comment in stride because I didn’t know the full background, but at that moment I simply took the supportability route and cautioned them on why that wasn’t such a good idea.  As the project roared on, it came to light that they simply wanted to put 2 Pools in the DMZ for user authentication, but not actually home users on these pools.  Lync veterans, I know what you are thinking: “Isn’t that just a Director, and how a Director role was deployed with OCS RTM/R2?”  And to all the novice Lync Admins: yes, essentially, this pool will be acting like a Director.  So based on how I decided to design the Lync Server 2013 central site in the Western Hemisphere of this project, a traditional Lync deployment would mimic this Visio, with the Directors located on the protected LAN:
[Diagram: Traditional_2]
So here is where the challenge surfaced with the Director role and the server placement of such a traditional design:
1.) The Director role, even though thoroughly explained to this organization by myself (a Lync MCM) and Microsoft, was simply not good enough in their eyes to meet the security needs of the organization, so they insisted on a complete Front End Pool.
2.) The pool they wanted as the “Director” also needed to be located in a protected area of the network, another purpose-built “Hidden DMZ”, so the authentication happened in this bubble.
3.) They only needed this hidden pool to authenticate Anonymous User join for conferencing, because 2-factor authentication isn’t an option and the Digest Authentication used for Anonymous users was viewed as a negative.
So knowing how a traditional design would look, we had to modify it to look like this Visio, with only the “Meet URL” passing through the “Hidden DMZ”:
[Diagram: different_2]
So this fulfills the same role as the Director: when anonymous users join, they initially hit this pool and authenticate before being “shuffled off” to the Home Pool of the conference organizer. This met their needs.  So, now on to understanding how to make this work.  I cautioned against this design with the normal points: you don’t know how things will change in the future, you never know all the ports this pool needs, you never know if an update will break this, Microsoft does not support this design in the documentation, etc.  So I had to do some simple tracing to find the minimum number of ports required from a Lync standpoint.  Now keep in mind, I did not consult on the ports needed by domain-joined machines in the DMZ.  This organization already deploys domain-joined machines to their DMZ and protects them how they see fit, so I only looked after the Lync ports.  Again, don’t question that practice, just know it is what it is.
During my testing and once again during go live we determined we only needed 3 ports open to the LAN for this to work and 1 from the traditional DMZ coming from the TMG.
1.) Port 445 – Bidirectional to/from LAN
2.) Port 444 – Bidirectional to/from LAN
3.) Port 5061 – Bidirectional to/from LAN
4.) Port 4443 – One way to the Hidden DMZ from the traditional DMZ
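If you want to reason about these four openings before handing them to a firewall team, here is a minimal Python sketch of the rule set. It is only a model for illustration, not anything that was deployed: the zone names, the `Rule` type and the `is_allowed` helper are all hypothetical, and I am assuming TCP for every port.

```python
from dataclasses import dataclass

# Zone names are illustrative labels only, not identifiers from the deployment.
LAN = "lan"
HIDDEN_DMZ = "hidden_dmz"
TRADITIONAL_DMZ = "traditional_dmz"


@dataclass(frozen=True)
class Rule:
    port: int            # TCP is assumed for every port
    source: str
    destination: str
    bidirectional: bool


# The four openings from the list above.
RULES = (
    Rule(445, HIDDEN_DMZ, LAN, bidirectional=True),
    Rule(444, HIDDEN_DMZ, LAN, bidirectional=True),
    Rule(5061, HIDDEN_DMZ, LAN, bidirectional=True),
    Rule(4443, TRADITIONAL_DMZ, HIDDEN_DMZ, bidirectional=False),
)


def is_allowed(source: str, destination: str, port: int) -> bool:
    """Return True if the flow matches one of the rules, honoring direction."""
    for rule in RULES:
        if rule.port != port:
            continue
        if (source, destination) == (rule.source, rule.destination):
            return True
        if rule.bidirectional and (source, destination) == (rule.destination, rule.source):
            return True
    return False
```

With this model, `is_allowed(TRADITIONAL_DMZ, HIDDEN_DMZ, 4443)` is permitted while the reverse direction is not, which is the one-way behavior described in item 4.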
The Hidden DMZ looked like this:
[Diagram: Hidden DMZ]
After this was all said and done, with the FEs and SQL Servers now placed into the purpose-built “Hidden DMZ”, the redirect works flawlessly, as expected.  In the Western Hemisphere, the Hidden DMZ now serves 2 different user pools, and in the Eastern Hemisphere there is another purpose-built “Hidden DMZ” that serves 2 additional pools there as well.  Each is fully redundant, offering High Availability, so traffic coming from either hemisphere can land on either Hidden DMZ to keep conferencing up and running.
Now on to answer a few outstanding questions I know are going through your head.
1.) What about Edge communication? 
The Edge was addressed by using the Lync-Solutions Security Filter to protect against DDoS attacks and force users to use TLS-DSK.  Home PCs are not allowed to connect at this organization; because of their security requirements, users can only connect remotely with company-issued PCs/laptops.  That being said, they could restrict Lync sign-in to machines that had the Lync-issued certificate downloaded to them.  This is the closest to 2-factor authentication they could achieve.
2.) What about other URLs, like Web Services, DialIn and LyncDiscover?
Again, the security filter was in place on the TMGs to guard against DDoS attacks.  The other web traffic, such as the Web Services URLs, was “OK’d” to hit the Front Ends.  You simply cannot redirect home pool Web Services URLs, as it is a hard requirement that they land on the respective pool.  LyncDiscover is still covered by a certificate, even though it is not an authenticated request on the initial query anyway, so there is no worry there.  We could have directed that traffic to the Hidden DMZ as well, but I recommended letting the first hit go to the internal servers.  The authentication of a user is performed up to 2 times anyway, first against the pool associated with LyncDiscover and then once again if you are homed on a different pool.  Again, this is okay, as they wanted the Hidden DMZ for anonymous authentication.
3.) What about mobility?
Well, this is a good one.  There simply was no good answer for mobility.  This organization is huge on 2-factor authentication, especially from a mobile perspective.  The Lync mobility client does not offer 2-factor natively, so in this case it simply was not available, not even from third party vendors.  Because of this gap, the organization had to make an exception until a solution is available.  Now, there were a few “attempts” by 3rd party vendors who said they could do it, but anyone who knows Lync at its deepest levels and how Lync authenticates knows you can’t simply slap a 2-factor authentication solution into IIS and expect it to work. It’s just not possible without a damning ripple effect.  With that said, however, Microsoft has acquired a mobile 2-factor provider called “PhoneFactor”, who previously built 2-factor solutions for other Microsoft platforms.  One could think that Microsoft has plans to build this into the product in the future.  I’m speculating, but I think it’s a pretty accurate guess.
So there you have it, within this wordy blog post, an interesting situation that turned successful with a little out of the box thinking and thorough testing.  I’ll leave you with some fun facts from this environment:

  1. Environment is built with complete HA/DR functionality to keep conferencing up at all times
  2. Support for 70k-140k users.  Nobody could agree on a set number…just think really big.
  3. Their intent is to reduce costs by giving people an option to bring on premise some of their external conferencing volume
  4. 52 Servers with 100% dedication to Lync, not shared in any way.  This number includes TMGs and SQL Servers.  56 servers if you include shared Witnesses.
  5. The whole environment is physical hardware, with the exception of the TMGs and the “Hidden DMZs”
  6. There are 2 “Hidden DMZs”, as they call them.  Both Hidden DMZs contain a Lync pool with 2 FEs per pool and 2 SQL Servers in a mirror configuration.
  7. FEs and SQL Servers are both entirely in this “Hidden DMZ”, which is inside another DMZ…I guess.
  8. 6 Total Pools, 4 user pools and these 2 Hidden DMZ Pools (think Directors)
  9. 8 Edge Servers
  10. They built a whole new VMware environment inside these DMZs specifically for these Hidden DMZ pools and TMGs.  This included new switches, hardware, network configuration and everything else that goes with building a new VMware environment and securing it.
  11. 12 GoDaddy Certificates.
  12. Lync Edge and TMG Security Filters to block potential DDOS attacks and to force TLS-DSK.
  13. 50ish SIP Domains
  14. 1 Certificate alone has 70+ SAN entries

Discussions and comments welcome

About the Author

I currently hold the Microsoft Certified Master on Lync Server 2010 certification and work as a Senior Technical Consultant at Perficient, specializing in Unified Communications design and deployments. My history in IT dates back 15 years, with my experience coming primarily from Microsoft technologies. I believe the Microsoft Unified Communications community is a very close and talented group of engineers who genuinely enjoy the technologies and collaborating with one another to help the technologies dominate the marketplace.

Thoughts on “Lync – A Tale of Stretching the Limits of Supportability”

  1. Hi Danny,
    Thanks for posting. What pains exactly are you going through? Can you elaborate?
    As for post support, yes, MS agreed to support this customer on a best-effort basis, but because the deployment is so highly customized, the client would have to make some changes to get back to supportability if the troubleshooting gets to a certain point. This of course was an agreement between the MS rep and the client. If you are going “out of bounds” of supportability within your deployment, it’s in your best interest to let your MS account rep know the situation and agree on terms. If it’s a smaller deployment with no MS backing, then it would probably be a bit more of a gamble moving forward. This is of course me giving you general advice without knowing your actual situation.
