When might one FIM sync server just not be enough?

I’ve been thinking a lot lately about the inherent scalability challenges of the FIM sync engine as we seek to push it harder and harder, throwing more and more identities and other related objects at it.  Is there a limit, and how long before we hit it?   We already know to expect long initial load/full sync times for systems with upwards of a million objects under synchronization, and although each service pack seems to include new ways of improving this, at what point do we need to step back and ask ourselves the inevitable question … is there another way?

Inspired by some left-field ideas from the wider FIM community @ TEC2012 Europe this week, I’ve started to conceptualize various ways one might go about mitigating the inevitable performance degradation that comes with growing object counts, and in particular the growing number of relationships we’re modelling between them.  I’ve been thinking that maybe the initial load should be achieved by bypassing the sync engine (I recall Blain Checkley posting about doing this not long ago), and then I thought again about the new async FIM MA export mode and decided perhaps that isn’t the place to focus so much now, given the throughput improvements being seen.  That led me to the next thought … how do I hedge against the inevitable long initial import/sync times?  That was when it struck me … maybe I already have the key and didn’t realize it?

Here’s my thought … take my Replay MA idea, only this time what if we use it not only on the regular instance of the FIM sync engine, but on a completely different one?  What would that do for us?

The thought goes like this … use the FIM MA for all updates to the FIM Portal … as it always should be.  However, use a secondary FIM sync engine without any FIM Portal of its own at all.  We should all know by now that every FIM service can be linked with one and only one sync engine, right?  But there’s nothing to say you can’t take a Replay of the FIM MA and play it through a completely separate FIM sync server instance!  What does that give us?

What this should allow is the off-loading of everything but import flows for the FIM service to (one or more) separate sync services, thereby distributing the load.  While this still means the FIM service must be loaded before any downstream sync (sourced from a FIM MA replay) can occur, the time it takes to reach the point where the export to the FIM service can begin could be much less than it otherwise would be.  What’s more, the subsequent “downstream” FIM sync service sync/export times should be much less too.  The only challenge I can see is how to join/project from the FIM Replay MA onto the metaverse of the secondary FIM sync services, given we can’t use the mv object ID in the same way as before.  Still, that should be manageable.

Will give this some more thought, but in the meantime I’m excited by the possibilities here.

Posted in FIM (ForeFront Identity Manager) 2010

Tuning FIM MA exports during initial load (pre R2)

I wanted to share my findings on following the steps outlined in KB2417774 to improve FIM MA export performance on a pre-R2 implementation I’m working on (build 4.0.3617.2, with 200K metaverse objects under management).

When faced with thousands of pending exports (mostly adds) to the FIM Service and observing an export rate of approximately 5 every 30 seconds, I figured it was time to give this “async” mode a try … given that it now appears to be the default mode in FIM R2 anyhow.

The following config changes resulted in an improvement in throughput to between 1 and 3 exports per second.  This was the kind of improvement I was hoping for … it meant that the exports were processed in a couple of hours instead of closer to 24.  However, I did find there were some drawbacks … which are all fine now that I know what to expect … so this post is for when you find yourself in the same situation (and also a diary note-to-self!!!).

Here is my current miiserver.exe.config setting for the FIM sync service (this goes immediately below the </startup> node):

<!--
  exportFetchResultsPollingTimerInSeconds: how often (in seconds) the sync engine polls the
    FIM Service for the results of submitted export requests
  exportRequestsInProcessMaximum: the maximum number of export requests allowed in flight at once
  exportWaitingForRequestsToProcessTimeoutInSeconds: how long to wait for in-flight requests
    to complete before timing out
-->
<resourceSynchronizationClient 
 exportFetchResultsPollingTimerInSeconds="4"
 exportRequestsInProcessMaximum="100"
 exportWaitingForRequestsToProcessTimeoutInSeconds="600" />
<system.serviceModel>

… and here is the one for the FIM service (Microsoft.ResourceManagement.Service.exe.config … goes immediately below the </system.serviceModel> node):

<!-- synchronizationExportThrottle: "Limited" throttles the rate at which queued sync export
     requests are processed; "Unlimited" removes the throttle (see below for why I avoid it) -->
<resourceManagementService externalHostName="myFIMServer.com" synchronizationExportThrottle="Limited" requestRecoveryMaxPerMinute="60" />

The first time I tried this, however, I used the “Unlimited” option and found that I eventually ran out of resources on my (slightly under-spec’d) SQL instance.  It also made accessing the FIM portal a problem (timeouts), and while I expected this because the KB warned me, I nonetheless found this too much of a hindrance in monitoring progress on the Request history.  I subsequently found that the above setting gave me the throughput I was looking for, whilst keeping the FIM portal more responsive.

I have since reset the Microsoft.ResourceManagement.Service.exe.config setting back to the following:

<resourceManagementClient resourceManagementServiceBaseAddress="myFIMServer.com" />

… noting that while the R2 notes say that this is now the default, there is nothing to say this is a recommended default pre-R2.  Having said that, you may find your environment has more resources than mine, and this may be workable … but for the moment I feel much more comfortable with the default configuration once the initial load is done.

There are a couple of catches to note when doing this:

  1. Make sure your tempdb database on your SQL server has room to grow.  By default this is possibly on your C: drive, as it was for me, and I noticed just in time that tempdb had grown to 17Gb (!!!) and I was running with about 3% free disk space.  Luckily I was able to (semi-) gracefully bring everything to a stop and remedy the situation by relocating this database to where I had diligently placed the FIM ones by following the simple steps outlined in this helpful SQL blog post.
  2. Make sure you’ve got plenty of RAM on your SQL server.  With tempdb growing like it does in this mode, you need to have RAM headroom to spare.  I doubled my lab VM (hosting SQL and the FIM sync service) from 8 to 16 Gb and would have doubled it again if I could!  As it was, I followed FIM SQL best practice by setting a RAM limit on SQL (7Gb) and found that this was at least workable.
  3. Expect the FIM service to take its time processing the queued requests.  If you choose the “Unlimited” option like I did at first, expect FIM to potentially take hours to clear the queue.  Even with the “Limited” option you can expect delays.
  4. Recycle the FIMService and FIMSynchronizationService (and SQL if you can, to bring tempdb back to a reasonable size) after you’ve finished the initial load phase.  With resources stretched I found that I needed to do this to get a performing FIM environment as quickly as possible.
  5. Rebuild the full text catalog on your FIMService database after the initial load (and defrag all the indexes on this db if you want to go one step further like me).
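For the SQL-side items above (points 1, 2 and 5), a minimal T-SQL sketch looks something like the following.  Note the file paths and the catalog name are illustrative assumptions only … check sys.fulltext_catalogs for the actual full text catalog name on your FIMService database:

```sql
-- 1. Relocate tempdb off the system drive (the move takes effect
--    the next time the SQL Server service is restarted)
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'D:\SQLData\tempdb.mdf');
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'D:\SQLData\templog.ldf');

-- 2. Cap SQL Server memory (7 GB expressed as MB)
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 7168;
RECONFIGURE;

-- 5. Rebuild the full text catalog on the FIMService database
--    ('ftCatalog' is an assumed name; confirm via sys.fulltext_catalogs)
USE FIMService;
ALTER FULLTEXT CATALOG ftCatalog REBUILD;
-- (index defrag is per table, e.g. ALTER INDEX ALL ON <table> REORGANIZE)
```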

P.S. Thanks to those responsible for awarding me with an MVP this month … it’s a great honour (that’s the English spelling!) and I’m proud to say that with Carol having returned home from Geneva in the last couple of years, Australia (and also UNIFY!) now has 2 of the current headcount of 11 FIM MVPs :).  I appreciate your continuing support.

Posted in FIM (ForeFront Identity Manager) 2010

When is a static FIM set dynamic?

Sometimes FIM can build you up just to cut you back down.  Just when you think you’ve designed the perfect set-based policy, with your custom schema and workflow activities written and tested, how many times do you discover that, try as you might, you can’t create the set filter you need?  That’s right – you come face to face with the dreaded unsupported filter definition.

For those of you for whom this has yet to happen, consider the following scenario – you have an AD group that is not managed by FIM, but you want to write some policy that fires for every user added to this group.  So you think at first that all you need to do is import/sync this group to FIM and write your set … but, like Steffen here, you come unstuck.  You can write your xpath statement in a search scope and it works fine, but not as a set filter!
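To make that concrete, here is a hypothetical example of the sort of statement involved (group name invented for illustration).  As a search scope this dereferencing query happily returns every user who is a member of the group, yet FIM will reject the very same statement as an unsupported set filter:

```xpath
/Person[ObjectID = /Group[DisplayName = 'Contoso GAL Users']/ComputedMember]
```

Set filters are restricted to a supported subset of the FIM xpath dialect, and dereferencing another object’s multi-valued reference attribute like this falls outside it.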

So when this happens, what can you do?  Sure you can look at the above linked post, and others like it, and you too will work out that FIM needs you to maintain a redundant reference property in order to define a valid set definition to do the job.  I’ve done this plenty of times myself, and come up with a number of ways to achieve this outcome … ranging from custom workflows to using the sync engine and something like my Replay MA idea.

However, I’ve come up with another idea … the dynamic static set :).

Just because you can’t write a valid filter doesn’t mean you have to give up on your policy … just use a static set instead, and write a workflow to recalculate and update its static membership whenever you need to – e.g. use a request-based MPR that fires for your group object whenever its synced membership changes.

So what tools do you need to do this?

You could almost do this with the OOTB FIM Function Evaluator … if only it would do the following 2 things for you:

  1. allow updates for objects other than those in context of a Request; and
  2. allow the update of a multi-value attribute (member) with the resource identifiers of all objects returned in your xpath.

Since you can’t do either of these with the OOTB component, you are, like me, going to have to write yourself a custom activity to do the above.  However, if you’ve followed my blog for a while you’ll have already seen my previous post on re-usable CRUD workflow activities, and you already have your own written by now :).

In any case, with a little imagination you too can overcome dynamic set limitations with “dynamic static sets”.

Posted in FIM (ForeFront Identity Manager) 2010

Inbound Attribute Flow Dilemmas

For the past 4 weeks I’ve found myself taking a trip down memory lane with rules extensions … in VB.Net, and not my preferred C#, no less. Ironically this was my first FIM R2 experience, and I was hardly prepared to discover that everything old is new again :). What this has allowed me to do is re-balance the scales a little (well, a lot actually) in the declarative vs. non-declarative argument. While there have been several lessons for me in this time, plenty of them about the R2 Lotus Notes MA, the observation I want to share concerns inbound attribute flow (IAF) and inbound precedence.

When I first rolled up my sleeves to begin implementing one of the more complex sync designs I’ve seen, I was determined to stick to the KISS (Keep It Simple Stupid!) principle. I decided the best strategy to approach what is effectively a 5-way GAL sync was to focus exclusively on the inbound flows first, and only look at outbound once we were 100% confident the inbound was right. This turned out to be a sound strategy, given the number of times I changed my mind on how I would actually do it.

I started out surfacing as many IAFs as possible in the form of sync rules, but soon ran into several challenges, including the following:

  • one authoritative source was Lotus Notes, and with nearly all its properties being multi-valued, many mapping to single metaverse properties, I soon dispensed with plans to do any kind of declarative flow with this MA;
  • I needed to use multiple join rules to map data to Notes and AD (including advanced joins on the multi-valued InternetAddress, ShortName, ListName and FullName properties of the Notes MA CS), and quickly realised that pursuing multiple declarative rules here, with variations in relationship, was going to be clumsy at best … this was a no-brainer, so I decided that if I used rules extensions for join rules I could still combine these with portal-defined IAFs on a simple relationship such as objectSID.

I could go on with a few more … but if the above weren’t enough of an argument to give up on declarative IAFs altogether, the final nail in the declarative coffin for me was what happened when I was forced to delete and recreate an MA in my solution several times. I had to do this for two MAs for various reasons, and each time I did this it meant that I had to delete and recreate my declarative sync rules for the new MA instance GUID. This meant the following:

  1. an awful lot of rework in re-creating sync rules;
  2. long sync times to reapply these new rules to my sync config; and (here was my clincher)
  3. redefining all my attribute precedence rules for 4 different metaverse object classes across more than half a dozen MAs.

After I found myself doing #3 above for the third time, I decided enough was enough, and deleted all of my IAFs in favour of the traditional direct/rules extension option. I did this with great reluctance, mainly because so much expectation exists in the maintenance value proposition of declarative sync rules over the alternative, and believe me I didn’t take this decision lightly. Every issue discovered resulted in one compromise after another, eventually leaving a mix of the two styles where the portal sync rules ended up being shadows of the original design. Resorting to 100% non-declarative in this case appeared to be my only option; I had certainly run out of whatever “discovery” time I had allowed myself, and needed to get some “runs on the board”.

Now back onto my point about precedence … when building a FIM Sync solution I really don’t want to have to go back to the metaverse designer and reset precedence on every object and attribute in the case where I need to delete and recreate an MA. Why does avoiding declarative inbound rules help here? Here’s my argument:

  • with no declarative rules, I can export my MA definition, export the full server configuration, delete and recreate my MA (new instance from exported MA), and then reimport the server export to reset precedence … this seems to work in resetting the attribute precedence back to what it was before the MA deletion; but
  • with declarative rules, when I try the above method (after the additional, tedious steps to recreate all the sync rules in the portal), I find that the precedence doesn’t get reset to what it was (I’ve seen this happen during config migration too many times … maybe this is just a bug, but I think it may actually be a FIM design limitation)

I am still expecting to be able to implement all but probably the Notes MA using declarative outbound flows (OAFs), but as it stands we have an IAF-free zone now. When I look at why this is the case, I am convinced that the success of a declarative sync rule model is entirely dependent on the integrity of your most problematic management agent/connected system. In this case it is Notes, and it will always be Notes, mainly because the Notes schema seems to be of only mild passing interest to the Notes Client, which allows users to enter multiple values in single-value attributes and no end of other schema violations. This is topped off by the fact that the Notes DN is derived from the multi-valued FullName and ListName properties of the user and group objects respectively, the integrity of which is at the mercy of the ADMIN_P process, which tries to play traffic cop with wayward directory entries.

So looking back over the last couple of years’ experience in implementing portal-based sync policy, I am convinced that the reason I’ve barely had to write a line of rules extension logic is that invariably I was syncing “apples with apples” … i.e. AD to AD, or Identity Broker to AD, or even Notes to Notes. To expect that you never need lift a hand to code sync rules any more just because we’ve entered the FIM R2 age is to be very mistaken indeed.

Posted in FIM (ForeFront Identity Manager) 2010

Using FIM for Delegated Access Administration

I have had the pleasure of working on a significant Australian Government ADFS project over the course of the past year, and looking back on this now it occurs to me that maybe there are not too many sites in the world yet that are using FIM the way it has been used there.  I can’t go into specifics, but I would love to know who else has attempted anything similar and discovered the same thing we did … that it just wouldn’t work without adopting a change event-driven approach with the FIM sync engine.

In this environment, a custom FIM client is used to delegate secure claims management to administrators who have themselves been delegated the administrative rights.  There are 2 cases where the real-time aspect of the solution has transitioned from a “nice to have” to an “essential” requirement, namely:

  1. The provisioning of and syncing to user accounts in AD from the FIM Portal (e.g. admin enable/disable); and
  2. The replication of claim objects in FIM to a SQL claims database wired up to ADFS.

It has become such an intrinsic part of the solution now that we’ve recently had to overcome a problem which was preventing accounts created in the FIM portal from immediately being provisioned to AD.  Get this … a test case was failed because 4 out of 10 accounts were not created in AD in less than a minute!  How many of you have had a FIM solution test case failure like that?  Happily, the problem was quickly tracked down to a FIM workflow being fired too quickly in succession (over WMI) to be picked up by the mechanism we were using, and a fix has now been made available for site testing.

So when I went to our http://www.fimeventbroker.com website just now, this got me thinking … people who look at this site and see “fully automated operations” may not appreciate that the event-driven FIM paradigm is about so much more than this … it’s about the ability to apply the FIM technology to real world challenges that just couldn’t be attempted without the tool.

Maybe this next anecdote is more common though … the “Crouching SharePoint” opportunity I blogged about recently was an example of an organisation looking to use the FIM Portal to replace an old in-house web “white pages” application, where updates were being made directly to AD.  It wasn’t hard for me to make the point that they couldn’t retire the legacy application and replace it with something that didn’t immediately apply the changes made in the FIM Portal to AD (via the Sync Engine).  I wonder how many of you out there are starting to make similar observations?

Posted in Event Broker for FIM 2010, FIM (ForeFront Identity Manager) 2010, ILM (Identity Lifecycle Manager) 2007

Designing and scheduling Housekeeping policy entirely within FIM

I’ve just added a new post to our FIM Community WIKI on the above topic … hope you too will find this as useful as I have with my past few FIM deployments.

Posted in FIM (ForeFront Identity Manager) 2010

(Referral) Misspelled Resource Attribute In MPR In FIM 2010 (R2) May Result In Access Denied

(2012-07-18) Misspelled Resource Attribute In MPR In FIM 2010 (R2) May Result In Access Denied.

Thanks for sharing this Jorge … a tiny bug in the default FIM Policy for FIM 2010 R2 😦

Posted in FIM (ForeFront Identity Manager) 2010

What is a Directory – a FIM POV

What’s in a word? Let’s break it down …

DIRECT … The way FIM likes its object and attribute flows mapped;
OR … What do you mean, like there’s a happy alternative??? We’re talking declarative here!!!;
Y … Would you even TRY the SQL MA???

There I’ve said it. What the FIM world needs now is for EVERY connected system to look, act and feel like one of these (a directory) even when it isn’t by default. If only …

Well maybe we’re not far away now from just that possibility … So watch this space 😉

Posted in FIM (ForeFront Identity Manager) 2010, ILM (Identity Lifecycle Manager) 2007

Applied EREs but no EAFs?

I had an awkward moment or two today in which my fundamental understanding of outbound FIM declarative sync rules was put to the blow-torch, and for a minute there I thought I probably had a corrupted Active Directory CS and was facing a complete deletion and reload.  Thankfully I avoided this, but let me take you through the experience in case you find this happening to you one day, and you face that awfully lonely moment when you feel you are in the hot seat of a live environment and the buck stops with you!  Rest assured that confidence in your own ability, coupled with a methodical process of elimination, will see you through …

Earlier I had successfully set up a heap of pending exports in my AD management agent, but shortly afterwards hit a dreaded “stopped-database-connectivity” exception on one particular full sync.  I had an overloaded environment to deal with which needed some serious housekeeping … and as soon as I saw the SQL error I knew I could avoid it no longer.  Several service restarts later, as well as reclaiming disk space and archiving thousands of run histories, I tried re-baselining my sync server, only to find that at the end of it my pending exports had … disappeared!

I retraced my steps and found that I hadn’t missed anything … the pending exports were just gone, even though the metaverse and target CS objects were still just as inconsistent as they were before.  I had renames, updates to title, manager and location, but none of them would manifest themselves as pending updates.  Enter troubleshooting mode …

I located a sample MV user object and observed the ERL contained the correct “applied” ERE, so I tried a full sync preview of the FIM MA connector … nothing!  No “skipped-not-precedent” warnings or anything.  Just no export updates to AD were there at all.  I re-read the section near the end of this reference, but couldn’t see anything missing at all.  I stumbled upon this forum post (and a familiar face therein) but still nothing rang any bells for me.

I decided the ERE must be corrupted, and so I figured I could recreate it specifically for this one user by

  1. updating my SR workflow to include a REMOVE SR before an ADD SR (I do this all the time now … I would now argue this should be best practice, but that’s another story);
  2. creating a temporary set with just my user; and
  3. creating a temporary set transition MPR to fire my workflow for just this user.

So now I could test and retest with just my user without invalidating my existing sync server baseline – simply by disabling and re-enabling the MPR.  I was sure I would see my EAF appear this time, but alas, nothing.  The ERE was created but went straight to an applied state, and a great big duck egg when it came to the number of pending EAFs.  Not good.  Again I did a preview, but again there was absolutely no clue as to what was wrong … the user/ERL/ERE/SR integrity was all there, and yet no joy.

I was starting to get concerned now.  I had noticed that during the re-baselining process I had set up some pending exports to FIM as well as to AD, and just hadn’t paid much attention to them … just kicked off the export and subsequent DI/DS.  Looking closer now I could see that a heap of DREs had been deleted.  I couldn’t work out why, but I had actually resolved a couple of hundred sync errors by causing a rejoin of incorrectly disconnected AD CS objects, and I figured this was just some sort of delayed clean-up.  Still, I wasn’t comfortable I was on top of this, and checked why we had a need for DREs at all.  I discovered an existence test, and a set of DREs which might have driven an MPR at one stage, but concluded that this was now redundant and I should remove the existence test …

By this time I felt I was really clutching at straws … I couldn’t see anything visible to fault the SR or ERE at all, and yet this DRE thing was enough to make me think that if I bit the bullet and removed the existence test I would thereby force an update of the SR to import to the sync server.  So I made the change, and followed it with a DI/DS of the FIM MA to bring in the update of the SR.  Given that the first step after doing this is always to run a FS on the FIM MA, I kicked this off and turned my attention to something else.  However, shortly afterwards I turned back to see what I had been hoping for, but had been bracing myself NOT to see … there were the counters for my pending exports to the AD MA, happily growing again!  By the time I had run a FS on all my MAs my pending exports were still there, and I could breathe again.

So in summary, if you have a problem with a seemingly correct ERE and SR, and you are getting no joy, don’t discount the possibility that it is the SR at fault.  In the process of restoring some lost joins I had to make a couple of changes to some of the inbound and outbound flows (including a couple of non-declarative attribute flows to restore properties to the metaverse), so it is possible that in doing this I invalidated the SR somehow.  Whatever it was, by simply modifying the SR (and I would have faked a change if I hadn’t had the existence test to remove) I was effectively able to flush the entire process out and reload.  Every now and then the ERE process does seem to get into a knot if you pull the carpet out from under it.

So my faith is restored, but with the caveat that sometimes FIM is a bit like your PC or laptop … sometimes nothing works like a reboot.  You don’t have to scrub out everything and start again, but sometimes it pays to be prepared to completely reconstruct things selectively.  I guess I had a hunch it would take something like this because I’ve had similar experiences before with a dodgy EAF in a SR … where it didn’t matter how many times I modified the EAF the sync engine seemed to completely ignore my change until I deleted and completely recreated my EAF (not the entire SR).  It is experiences like this which reinforce my opinion that FIM will eventually do the right thing … you just need to be VERY patient at times.

Posted in FIM (ForeFront Identity Manager) 2010

Windows Server 2012 Dynamic Access Control Overview

Please check out this Microsoft Channel 9 presentation, and this walk-through here.

As a FIM identity guy, the reason I’m so interested in this is that it suddenly means the integrity of user properties changes from being “nice to have” (useful metadata about a person) to being a security dependency for the organisation: you can’t set up a DAC rule on a folder such as “all users in department x have access to resource y” if you can’t rely on the integrity of the department property :).

Posted in FIM (ForeFront Identity Manager) 2010