tag:blogger.com,1999:blog-34457549064619476812024-03-14T05:08:43.456-04:00Data Enclave in the CloudBryan Beecherhttp://www.blogger.com/profile/18073369761415410262noreply@blogger.comBlogger16125tag:blogger.com,1999:blog-3445754906461947681.post-54867191388941490032010-06-28T09:47:00.004-04:002010-07-12T22:08:51.264-04:00Cloud performanceAt times during testing of our ACI Windows prototypes, we have experienced some less than ideal performance results. These have included the time to complete tasks such as log-in, application launch, and data upload/download. In and of itself, this is not a concern and may only require that we use larger server instances with more available CPU and memory. However, what was unexpected was the inconsistency of the results. Two identical instances (CPU, memory, storage, OS, patch level, applications, policies, etc.) should provide very similar (good or bad) test results.<br /><br /><a href="http://www.webmetrics.com/">Webmetrics</a> financed <a href="http://www.bitcurrent.com/">Bitcurrent</a> to conduct a month-long study of the performance of five of the major public cloud providers. They recently published their <a href="http://www.webmetrics.com/landingpage/bitcurrentcloud/The_Performance_of_Clouds_Complete.pdf">results</a>.<br /><br />While the study doesn't explain our inconsistent results, it does help us understand the strengths and weaknesses of each vendor's approach and architecture. This kind of information, when considered with cost and management capabilities, will help to determine the right vendor for a given use case. 
As for the consistency concern, we'll need to begin charting activity to identify the most likely source of the issue and determine the best approach to a solution.Stuart Hutchingshttp://www.blogger.com/profile/16909683497699347297noreply@blogger.com3tag:blogger.com,1999:blog-3445754906461947681.post-41987613580304338782010-05-14T15:30:00.004-04:002010-05-18T14:20:33.542-04:00Windows ACI: Stu's weepingThat we will utilize Microsoft's <a href="http://en.wikipedia.org/wiki/Active_Directory">Active Directory</a> (AD) to manage our server instances and users (accounts and profiles) has been the assumption from the beginning of this experiment. How best to accomplish this is a very challenging question. Hence the label: "<span style="font-weight:bold;">Stu's weeping</span>".<br /><br />There are several reasons for using AD as part of our management strategy. First, it facilitates a single sign-on (SSO) solution. Without it, users would first need to log into the TS Gateway and then into their individual Terminal Server instance. Second, it allows us to centrally manage user accounts, user profiles, and user and server configurations via group policies. Third, it permits us to leverage our existing management practices and procedures. <br /><br />If we moved to production and on average only supported 2-3 TS instances and 5-25 user accounts, this would probably be overkill. However, we believe that if successful, we will likely be supporting many TS server instances with potentially hundreds of user accounts.<br /><br />We've spent the better part of the last several weeks working to figure out whether the better approach is to create an AD structure within the cloud or to try to join our cloud-based servers to the University's test AD forest, under the ICPSR Organizational Unit (OU). 
We believe the better route is to join them to our on-premises AD, but it has been more difficult than anticipated for a variety of reasons, which I'll explain in the next post.Stuart Hutchingshttp://www.blogger.com/profile/16909683497699347297noreply@blogger.com0tag:blogger.com,1999:blog-3445754906461947681.post-3566331403448611642010-05-12T13:50:00.004-04:002010-05-14T15:31:21.295-04:00Windows ACI: Terminal Server GatewayI've elected to set up a Terminal Server Gateway as the entry point for access to the ACIs. This server role was introduced with Windows Server 2008 and provides several benefits that will help with the management and security of the ACIs. First and foremost, a TS Gateway provides a means to encapsulate the Remote Desktop Protocol (RDP) traffic through an SSL tunnel using HTTPS over port 443. In turn, the TS Gateway facilitates a standard RDP connection, over port 3389, to the requested resource. This allows us to configure the firewall of our cloud service provider account and the individual server instances so that the ACIs are reachable only via the TS Gateway and a few IP addresses here at ICPSR. The TS Gateway also provides for the implementation of client and resource access policies to control which resources can be accessed and by whom; we can therefore restrict access to individual ACIs strictly to the users identified in the associated research request. Additionally, the TS Gateway can be configured to utilize a Network Access Protection (NAP) health policy. 
This technology allows us to define the "health" conditions a client must meet in order to be permitted to connect, for example that the client's firewall is enabled, its operating system security patches are current, and it has anti-virus software installed.<br /><br />Implementation of this server role should help to simplify support in a production scenario where there could be a large number of ACIs active at any given time.Stuart Hutchingshttp://www.blogger.com/profile/16909683497699347297noreply@blogger.com2tag:blogger.com,1999:blog-3445754906461947681.post-22019170154346091362010-04-15T15:54:00.005-04:002010-04-15T22:16:54.282-04:00Windows ACI: Terminal ServerOur 'concept' Windows ACI is using Windows Terminal Services. This technology, renamed Remote Desktop Services with the release of Windows Server 2008 R2, has been available as a service/role of the Windows Server operating systems since NT. Its maturity has allowed for a great deal of documentation to be developed about managing it, its quirks, and its strengths and weaknesses. This gave us a head start in assessing the challenges of leveraging it to deliver the robust environment we have envisioned. We are not using the R2 version (hence we continue to refer to it as Terminal Server) because Server 2008 R2 is not supported (at least currently) by the major cloud computing service providers. That is fine, as it does not appear that any of the enhancements or features provided with the R2 release offer any additional value to our project.<br /><br />Terminal Server is Microsoft's solution for server-based computing (SBC). I find that ironic given it was Microsoft's popularization of desktop computing that led to it replacing the original SBC... the mainframe. Using SBC allows us to provide access to requested research data by focusing instead on providing the working environment in which researchers perform their analysis. Researchers can remotely access a server we create for them. 
Once logged in, they are provided a standard Windows-style desktop with the tools and applications they require to perform their work, using a copy of the requested data. The remote session or connection to the server will appear as just another 'window' on their own desktop or laptop computer. They will not be able to move items to/from the remote server to their computer, but will be able to interact with all the applications installed on the server as they normally would.<br /><br />The question of course is can we satisfy our security requirements while giving the researcher a rich environment with good performance?Stuart Hutchingshttp://www.blogger.com/profile/16909683497699347297noreply@blogger.com0tag:blogger.com,1999:blog-3445754906461947681.post-74650363249175262982010-04-14T13:38:00.005-04:002010-04-14T14:16:50.612-04:00Windows ACI: Concept or Prototype?I'm not certain which descriptor fits, but regardless, we have in place a test version of our Windows ACI. At this point it's a pretty basic setup of a Windows 2008 server configured to perform the <a href="http://en.wikipedia.org/wiki/Remote_Desktop_Services#Terminal_Server">Terminal Server</a> role. I've installed three of the statistical software packages used here at <a href="http://www.icpsr.umich.edu/">ICPSR</a> along with a few productivity applications such as Word and Excel. Test data has been placed on the device. We are using a mandatory user profile, and researchers will connect to their ACIs via a separate Windows 2008 server configured as a <a href="http://en.wikipedia.org/wiki/Remote_Desktop_Services#Terminal_Services_Gateway">Terminal Server Gateway</a>. It's our intent that the TS Gateway will be the only Internet-accessible device, but we are still working out all the assorted details.<br /><br />Felicia is currently giving it a test spin, and I am developing a questionnaire intended to capture observations from testers of the system. 
This document will likely evolve as we proceed.<br /><br />I think concept is probably more accurate as this test ACI is extremely rough around the edges and will most assuredly change a great deal before we begin piloting.Stuart Hutchingshttp://www.blogger.com/profile/16909683497699347297noreply@blogger.com1tag:blogger.com,1999:blog-3445754906461947681.post-74775573925900641942010-04-13T15:46:00.003-04:002010-04-13T15:57:45.527-04:00What we've been doing during our spring break<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgxSuhcr8aYaP7bgAty2UHkTUnGfpJ-A2QBTpIlgBw3RTdJJN3hsEHsqu1OUznNM2j5wbKPje6BxUSjGFAIPttZv-yTcNIYT56FsSkYQfQXBva6dLmTqJNJEcgSinadg_prF5TXsKf2i9gR/s1600/ACI+systems.png"><img style="display: block; margin: 0px auto 10px; text-align: center; cursor: pointer; width: 400px; height: 265px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgxSuhcr8aYaP7bgAty2UHkTUnGfpJ-A2QBTpIlgBw3RTdJJN3hsEHsqu1OUznNM2j5wbKPje6BxUSjGFAIPttZv-yTcNIYT56FsSkYQfQXBva6dLmTqJNJEcgSinadg_prF5TXsKf2i9gR/s400/ACI+systems.png" alt="" id="BLOGGER_PHOTO_ID_5459710840457269618" border="0" /></a><br />The blog has been quiet, but the team has been busy working on components of the system.<br /><br />Steve has built a prototype of our ACI Chooser, the web application that enables a researcher to link an executed restricted-use contract with ACI preferences, such as operating system, stat package, etc.<br /><br />Steve has also built an early version of the ACI Launcher. This tool uses preferences set by the researcher, and builds an AWS instance to support the research. The tool also enforces ICPSR's license and access controls related to the research data.<br /><br />Unlike the world of Linux where it has been pretty easy to define and launch a locked-down, managed AWS instance, the world of Windows has been more difficult. 
Since joining ICPSR earlier this year, Stu has made great progress building an early version of a Windows-flavored ACI to go along with the Linux-flavored one Steve has already built. Stu has also been researching the best way to tie our ACIs with necessary Windows infrastructure, like Active Directory. We often see Stu weeping, and we are not sure if it is with joy or frustration.Bryan Beecherhttp://www.blogger.com/profile/18073369761415410262noreply@blogger.com0tag:blogger.com,1999:blog-3445754906461947681.post-78579265146366995882010-03-08T14:26:00.003-05:002010-03-08T15:12:56.534-05:00The Windows ACI: Decisions...There are a number of technologies available for developing and managing virtual Windows computing environments. We decided to focus initially on products available from Microsoft and will then compare them against the benefits of alternatives such as Citrix and VMware.<br /><br />The design we are considering would leverage the University of Michigan Active Directory (AD) infrastructure for user and computer account management. Residing in the cloud would be a public-facing Remote Desktop (RD) Gateway through which users of this system would access their research group's ACI. Each ACI would be a Remote Desktop Services (RDS) server, formerly known as Terminal Server, configured based on the information provided by their PI through the ACI Chooser. The RD Gateway would validate the users' credentials and redirect them to their assigned ACI.<br /><br />This approach allows us to manage a single entry point for access to the cloud-based resources, and utilizing AD permits policy-based management of the configuration of the Windows ACIs and associated user accounts. However, a requirement of this approach is that the RD Gateway residing in the cloud must be permitted to connect to the UM network. In the short term we are working with University engineers to permit this connection. 
Long term, we will evaluate alternative approaches to facilitate this design, such as using dedicated VPN appliance gateways to establish the connection, or forgoing the connection to the UM AD and instead creating an AD infrastructure in the cloud. We will also determine whether we should set up a Virtual Private Cloud (VPC) in EC2 as an added layer of security. Right now our focus is on developing a working prototype RDS that can be tested by our PI and the ICPSR staff.<br /><br />My next few blog posts will drill down into the merits of this design, its components and their roles, and the decisions required if we were to proceed with a production implementation of this approach.Stuart Hutchingshttp://www.blogger.com/profile/16909683497699347297noreply@blogger.com1tag:blogger.com,1999:blog-3445754906461947681.post-6183909463264109342010-02-23T14:40:00.003-05:002010-02-23T15:16:49.544-05:00Customizing an ACIOne of the tasks associated with bringing up an ACI is configuring it for a particular group of researchers. There are (at least) three steps that have to be taken:<div><ul><li>create user accounts</li><li>make required statistical software available</li><li>copy the specified dataset(s) to the ACI</li></ul>My plan was to create an archive containing the following pieces:</div><div><ul><li>a list of users for whom accounts needed to be created</li><li>a list of statistical packages to be activated</li><li>an inner archive containing the dataset(s)</li><li>a script which would use the two lists to create accounts and activate software, and then expand the dataset archive </li></ul>The ACI creation script would then pass that archive to the instance via the '--user-data-file' option of the 'ec2-run-instances' script. 
At first run, the ACI will fetch the user data, unpack it, and execute the included script.</div><div><br /></div><div>Unfortunately, there is a 16,384-byte limit on the size of the user data, which meant that this approach was not practical.</div><div><br /></div><div>My current idea is to create two archives. The main archive is as described above; the second is a bootstrap archive that will contain:</div><div><ul><li>a pointer to a location from which the main archive should be fetched</li><li>a set of credentials that can be used to do that fetching</li><li>a script that can use those credentials to fetch the specified archive</li></ul>The bootstrap archive will be small enough to be passed to the instance via the '--user-data-file' option of the 'ec2-run-instances' script. At first run, the ACI will fetch the bootstrap archive, unpack it, and execute its included script, fetching the main archive. Once the main archive has been fetched, the credentials will be destroyed, the just-fetched archive will be unpacked, and its included script executed.</div><div><br /></div><div>I'm working on that process now.</div>Steve Burlinghttp://www.blogger.com/profile/10470280530770378566noreply@blogger.com0tag:blogger.com,1999:blog-3445754906461947681.post-26843130438672398072009-12-03T10:53:00.012-05:002009-12-03T11:15:45.749-05:00Clarifying the CloudIn vacation weeks and those that follow, the group working on the Data in the Cloud does not usually meet. But we are always working on something.... <div><br /></div><div>A couple of things of interest for this project are (1) what other services at ICPSR we have put in the cloud and (2) what resources Steve Burling (our 70% man ....we call him that because 70% of his time is dedicated to the project....it isn't meant to imply that he is only 70% of a man ...I think) is using to build the applications and images for this project. 
</div><div><br /></div><div>First, a link to Bryan's <a href="http://techaticpsr.blogspot.com/2009/11/icpsr-and-cloud.html">ICPSR Tech blog</a> about the other ICPSR services running in the cloud. Ongoing experience with these services will be useful primarily for monitoring service performance. Bryan will keep us informed about problems or triumphs so we can evaluate other sources of information that will inform what we do. Synergy is fantastic, don't you think? </div><div><br /></div><div>Second, Steve reports that these are the specific resources he has been using to build our ACI (remember this is our fancy acronym for the customized analytic space we are creating for users) in the cloud. The primary source is Amazon's documentation on the <a href="http://aws.amazon.com/documentation/">AWS service</a>. In particular from there, he has used the <a href="http://docs.amazonwebservices.com/AWSEC2/latest/GettingStartedGuide/">Getting Started Guide</a>, the <a href="http://docs.amazonwebservices.com/AWSEC2/latest/DeveloperGuide/">Developer Guide</a>, and the <a href="http://docs.amazonwebservices.com/AWSEC2/latest/CommandLineReference/">Command Line Reference</a>.</div>Felicia LeClerehttp://www.blogger.com/profile/06074946877745616936noreply@blogger.com1tag:blogger.com,1999:blog-3445754906461947681.post-27190488681917466702009-11-17T19:11:00.003-05:002009-11-17T19:32:08.831-05:00ACI Chooser --- what a terrible nameWe met today to decide how to put together the place where we gather information from the contract holder so that we can configure their space. I think that's what Bryan means by "Chooser" ---ugh. The inelegance of tech speak is legendary, but this one is a real winner. 
The restricted contracting portal will already provide us with a great deal of information about the primary investigator with which we can "populate" (another elegant use of a term) the forms, but we need to confirm the identity of the primary investigator and his/her research team. We will do that with an entry screen that confirms the identities through a MyData registration for the PI and a list of research team members and their emails. We then need to identify their choice of stat package and OS (the ongoing argument between Steve Burling --our 70% man-- and Bryan is that almost 90% of the folks coming through this will choose a Windows environment as an OS and thus it makes both the OS choice and Burling an anachronism). We will also bring information from the RCS to determine the structure of the security for the data system the PI and team are accessing. <div><br /></div><div>The things left to research are (1) how to bring users through to a firewalled entry to the site so that they do not have to find and identify their IP addresses; (2) whether and how the PI would set up read/write permissions within the workspace for their affiliated researchers; (3) whether to allow content to be uploaded to the ACI without scrutiny. This research belongs to Steve. 
</div>Felicia LeClerehttp://www.blogger.com/profile/06074946877745616936noreply@blogger.com0tag:blogger.com,1999:blog-3445754906461947681.post-29445509750707946642009-11-05T09:50:00.006-05:002009-11-05T10:08:03.736-05:00Project Deliverables - Software<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgRwIIlJa8P7-zVGNJHUa7DRigcDc-KZ7XKqjahdAA7yXJnvFPeGcl-l2LyCqCPiI2W7zk80L9VR-IpS1nKSPfyT7WaaymaJ3UhFPKLlVXQJXV8qPIlChi9cWYLM9dMgBZTQ9KgKrvUGVqe/s1600-h/Simple_Cloud_API_1253755357_0.jpg"><img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer; width: 266px; height: 200px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgRwIIlJa8P7-zVGNJHUa7DRigcDc-KZ7XKqjahdAA7yXJnvFPeGcl-l2LyCqCPiI2W7zk80L9VR-IpS1nKSPfyT7WaaymaJ3UhFPKLlVXQJXV8qPIlChi9cWYLM9dMgBZTQ9KgKrvUGVqe/s320/Simple_Cloud_API_1253755357_0.jpg" alt="" id="BLOGGER_PHOTO_ID_5400635414879602962" border="0" /></a>So, what do we need to build to start our science experiment of building a secure data enclave in a public computing utility cloud?<br /><br />Felicia and I have been talking about the list of software deliverables for the project. These flow from the high-level architecture diagram from my earlier post; if the high-level diagram is at the 10,000 foot level, then these items are at the 1,000 foot level. (And the Cloud Developer we are recruiting will take them down to the 1 foot level.)<br /><br />I think we'll need six systems:<br /><ol><li>ACI Chooser - This is a public-facing webapp where the researcher selects options to configure the ACI. This might include the desired platform (Windows), desired analytic software (SAS), the allowable users (say a subset of people on the restricted-use contract), a zipped bundle of miscellaneous tools the researcher wants pre-loaded on the ACI, etc.</li><li>ACI Pre-Launcher - This is an ICPSR-facing (command-line?) 
utility for taking configuration information supplied by the researcher + restrictions associated with the dataset(s), and launching a customized ACI in the cloud for the researcher.</li><li>ACI Post-Launcher - This is also an ICPSR-facing utility for customizing the ACI, but performs post-launch configuration of the instance. We may decide to couple #2 and #3 into a single tool.</li><li>ACI Watcher - This is a tool that monitors the availability and performance of each ACI. If the ACI is unavailable or sluggish, this will tell us.</li><li>ACI Dashboard - This is a tool that aggregates views of all ACIs, giving an overall view of the cloud provider and. Perhaps this would be public-facing if properly anonymous?</li><li>ACI Waste Manager - This tool securely and completely cleans up an ACI once the research has been completed. This tool will have done its job if there is no trace of the research or the data left in the cloud once the ACI has been terminated.</li></ol>There are undoubtedly other items we'll also need, but this is a good starter list.Bryan Beecherhttp://www.blogger.com/profile/18073369761415410262noreply@blogger.com1tag:blogger.com,1999:blog-3445754906461947681.post-56816405770868836032009-11-02T14:00:00.000-05:002009-11-02T14:00:00.902-05:00High-Level System Architecture<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQOvQs0fGY0SiY4fRbYVFcxJ_thH3RcI3l1KjqTYTR4MLV91yd2KLYbWuOnaNwNarJwbnC9KpL1C2VaP-S1Oz294RER1BeeIr1Lb8rU08YrXRdsq0wav8TZE-42tyMVzB9sZCx9tpYVmaA/s1600-h/ACI.png"><img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer; width: 400px; height: 299px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQOvQs0fGY0SiY4fRbYVFcxJ_thH3RcI3l1KjqTYTR4MLV91yd2KLYbWuOnaNwNarJwbnC9KpL1C2VaP-S1Oz294RER1BeeIr1Lb8rU08YrXRdsq0wav8TZE-42tyMVzB9sZCx9tpYVmaA/s400/ACI.png" alt="" id="BLOGGER_PHOTO_ID_5398449870085840274" border="0" /></a>[ click the 
image at the left to navigate to a larger version ]<br /><br />Our high-level architecture for our Enclave in the Cloud starts with a researcher who is interested in using restricted-access data.<br /><br />Step #1: The researcher uses ICPSR's contracting portal (which is nearing completion) to submit a request for access. This portal pulls together information from the researcher (who will be using the data, what's the research plan, institutional approval) and information from the dataset (licensing terms, data protection requirements).<br /><br />Step #2: ICPSR reviews the application, and if everything is in order, approves access to the data.<br /><br />Step #3: The researcher uses a (yet to be built) portal to configure choices about access: platform (Linux or Windows), required statistical software, etc. This portal also pulls in requirements from the contracting system which may influence available options.<br /><br />Step #4: ICPSR uses this configuration as a template for a (yet to be built) utility that launches a virtual machine in the cloud. This system - an Analytic Computing Instance - contains all of the data and software that the researcher or research team will need, and is protected by firewalls and host-level security to prevent unauthorized access.<br /><br />Step #5: The researchers download a copy of the Citrix client (if the ACI platform is Windows). This is the tool they will (likely) need to use to log in, and which can restrict functions such as cut and paste between the ACI and the local desktop. We'd like to make this download and install as easy as downloading Acrobat Reader.<br /><br />Step #6: Research happens, and while it is happening....<br /><br />Step #7: ICPSR monitors both the cloud provider and the ACI for performance and security. 
Some of the tools we'll use for this already exist because ICPSR uses Amazon's cloud for several extant portals and systems.<br /><br />Step #8: The research has concluded and ICPSR destroys the ACI in a secure manner such that no trace of the research or the data lingers in the cloud.Bryan Beecherhttp://www.blogger.com/profile/18073369761415410262noreply@blogger.com0tag:blogger.com,1999:blog-3445754906461947681.post-81588881923994164672009-10-26T17:12:00.000-04:002009-10-26T17:23:34.074-04:00Things we are thinking aboutIn the most recent meeting of the "Cloud" team, two issues came up that need further research and take us into some of the gray areas of cloud computing. The first is software licensing in the cloud. Our goal is to provision the analytic instances with the software of choice (within reason) for analysts. The question is whether ICPSR's licenses apply to "virtual space" if we are in fact renting that space from Amazon. We are pursuing it. The second question, which requires some experimentation, is whether people who want to access the "Cloud" will come in via Windows Remote Desktop or our Citrix server. The tradeoff is security vs ease of use. Remote Desktop is built into Windows, whereas users would need to install additional software to access the Citrix system. We are going to test this as we move forward. These two issues will be documented by the team once we have a clearer notion of how this will work.Felicia LeClerehttp://www.blogger.com/profile/06074946877745616936noreply@blogger.com0tag:blogger.com,1999:blog-3445754906461947681.post-50372403135972783432009-10-21T13:21:00.000-04:002009-10-21T14:15:36.382-04:00First meeting summary and a linkWe had our first substantive meeting of the "Cloud" team on 10/13/09 to try to start the engine on this project. 
I will let the technical people describe how we are to approach the tech side, but I think it is useful to summarize the naive aspects of this experiment in order to keep some of the narrative in non-tech speak.<br /><br />The pieces that need to be constructed for this to work are (1) the web app that gathers data from the user about how they want their analytic instance to look ---i.e. what software, etc. and (2) the "image" ---that is the application that instructs the cloud how to behave. We will develop the UNIX and Windows sides of the image independently.<br /><br />We also need to gather data set and user information from the Restricted Contracting System to both set the security conditions for the data and to pre-populate some of the web forms.<br /><br />The other issues we discussed were (1) dealing with licensing issues for software that will be used in the analytic instances and (2) how the cloud data will be backed up.<br /><br />Bryan provided us with an interesting article on <a href="http://cseweb.ucsd.edu/~savage/papers/CCS09.pdf">cloud security. </a> The great thing about this project is it provides so many wonderful metaphors (which social science usually does not, frankly). This article is entitled something like "Hey You, Get Off Of My Cloud". If you are old enough ---you will know that it is a Rolling Stones song from 1965. So, we now have a theme song for the grant as well. 
Who can ask for more?Felicia LeClerehttp://www.blogger.com/profile/06074946877745616936noreply@blogger.com0tag:blogger.com,1999:blog-3445754906461947681.post-82271393867138628272009-10-09T10:54:00.000-04:002009-10-09T11:34:44.766-04:00Text of the Challenge Grant<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXQmHdKcUGYWdSP1Lut5oKoB0JzJornojvs3x_TQaALAJKMuBvqihQEks_BHKPrbGAMIkmIU1zflKgPxbo4WnjGJ-i0KA5VNGzfGgvAWAq2ObUvBtiN27uKlCM0zB4ybsDgbFq2LX8Ckv5/s1600-h/cloud.jpg"><img id="BLOGGER_PHOTO_ID_5390623104985445986" style="FLOAT: right; MARGIN: 0px 0px 10px 10px; WIDTH: 309px; CURSOR: hand; HEIGHT: 400px" alt="" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXQmHdKcUGYWdSP1Lut5oKoB0JzJornojvs3x_TQaALAJKMuBvqihQEks_BHKPrbGAMIkmIU1zflKgPxbo4WnjGJ-i0KA5VNGzfGgvAWAq2ObUvBtiN27uKlCM0zB4ybsDgbFq2LX8Ckv5/s400/cloud.jpg" border="0" /></a><br /><br /><div><div></div><div></div><div><div><div>We will have our first design meeting next Tuesday, so it makes sense to post the text of the proposal as a shared document. This version contains the <a href="http://docs.google.com/Doc?docid=0ARbLNV7t3UtlZGo2d3N4cV8yM3pobXAzcGNr&hl=en">narrative</a> for the grant. Two useful components of the document will guide the development. The first is Bryan's stick figure conceptualization of how we will put data in the cloud. The "Analytical Computing Instance" is a compromise name for what the user will configure and see in the cloud. "Instance" is the terminology used to describe how clouds are used, but it seems a strange name. It may change as we get further along.<br /><br /></div><br /><br /><div>The second part of the document that will guide our work is, of course, the schedule. The first part of the grant period is for setting up design specifications. The two components to be built are the ACI compiler and the web user interface. 
</div><br /><div></div><br /><div></div><br /><div><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEisBEqJopXzLExqY2UqaawdiBDG3J6DbGHWUGBDwZX_3qB1fr6m8h0gfmgZ86i5jFnhwvAvXNV7EYlozR-gE61ccRJC9DbCdTjIL-jIh-qAHM7FxQK7iuv4re1mHsK2ec5jM8wVCXTgJOAm/s1600-h/timetable.jpg"><img id="BLOGGER_PHOTO_ID_5390621175898756210" style="WIDTH: 358px; CURSOR: hand; HEIGHT: 229px" alt="" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEisBEqJopXzLExqY2UqaawdiBDG3J6DbGHWUGBDwZX_3qB1fr6m8h0gfmgZ86i5jFnhwvAvXNV7EYlozR-gE61ccRJC9DbCdTjIL-jIh-qAHM7FxQK7iuv4re1mHsK2ec5jM8wVCXTgJOAm/s400/timetable.jpg" border="0" /></a><br /><br /></div><br /><br /><br /><br /><br /><div></div></div></div></div>Felicia LeClerehttp://www.blogger.com/profile/06074946877745616936noreply@blogger.com0tag:blogger.com,1999:blog-3445754906461947681.post-88509457737845026572009-10-01T10:19:00.000-04:002009-10-01T10:58:05.571-04:00Putting Confidential Data in the CloudsThis blog is designed to chronicle progress on a new project at the <a href="http://www.icpsr.umich.edu/">Inter-University Consortium for Political and Social Research </a>funded by the <a href="http://www.nih.gov/icd/od/">National Institutes of Health, Office of the Director </a>through the <a href="http://grants.nih.gov/grants/funding/challenge_award/">Challenge Grant Program</a>. The primary goal of the grant is to test whether confidential data, that is data distributed under license or contract, can be effectively and safely disseminated via the <a href="http://en.wikipedia.org/wiki/Cloud_computing">computing cloud</a>. Currently data licenses and contracts put the burden of securing data files on the user. This often involves elaborate data security plans that may involve purchasing new technology or securing existing networks and machinery. 
This grant is to test whether we can dynamically configure temporary computing environments in the Cloud that will provide users with a secure environment in which to analyze confidential data. We will be building both the application that provisions this analytic instance and the web interface to help users navigate it. The experimental part of this project is to test cloud security and analysts' reactions to more distant analytic environments where they have less control. We have partners in this endeavor to help us recruit users to test the applications we build. They are at the <a href="http://psidonline.isr.umich.edu/">Panel Study of Income Dynamics </a>and the <a href="http://www.lasurvey.rand.org/">Los Angeles Family and Neighborhood Survey</a>.<br /><br />This is the first day of the project!Felicia LeClerehttp://www.blogger.com/profile/06074946877745616936noreply@blogger.com0