[Gmod-schema] Re: example data for comparative genomics analysis

[Gmod-schema] Re: example data for comparative genomics analysis

Don Gilbert

Kara,

It happens I've got a set of OrthoMCL results for the 22 organism genome
sets* from MODs and Ensembl that Erik Sonnhammer used for InParanoid this winter.
I've run the reciprocal blastp of these (using TeraGrid, another possible
MOD resource), then ran OrthoMCL on them.  I've been spending
some time recently putting these into a version of Chado (which, however,
is running in MySQL).  It isn't quite complete, but I hope to share
all of these with anyone who wants them, through eugenes.org.

The source data and blastp results are now at
ftp://eugenes.org/eugenes/proteomes/   (blastp results inside blout/)
I'll take a look at the state of the chado-mysql database and probably
put a dump of it for public access in the same eugenes/ folder.

I'd be very happy to work with you and others interested in
a collaborative attack on this relatively large data set.  Just
managing all the gene/protein IDs to get them straight (Ensembl uses
one set, NCBI another, MODs another ...) is a chore.  UniProt seems
to be the Rosetta Stone here.
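
One way to picture the ID reconciliation is as a join through a shared UniProt accession. What follows is only a minimal sketch, assuming each source supplies a two-column list of (source ID, UniProt accession) pairs. The function names, source labels, and every ID shown are made up for illustration; they are not taken from the eugenes.org files or from any real mapping table.

```python
from collections import defaultdict

# Hypothetical example data: each source maps its own IDs to an accession.
MAPPINGS = {
    "ensembl": [("ENSP_demo1", "ACC0001"), ("ENSP_demo2", "ACC0002")],
    "ncbi": [("NP_demo1", "ACC0001")],
    "mod": [("MOD_demo1", "ACC0001"), ("MOD_demo2", "ACC0002")],
}


def build_crossref(mappings):
    """Group per-source IDs under the accession they share.

    mappings: dict of source name -> iterable of (source_id, accession).
    Returns a dict of accession -> {source name: sorted list of IDs}.
    """
    crossref = defaultdict(lambda: defaultdict(list))
    for source, pairs in mappings.items():
        for source_id, accession in pairs:
            crossref[accession][source].append(source_id)
    # Convert the nested defaultdicts to plain dicts with sorted ID lists.
    return {
        accession: {src: sorted(ids) for src, ids in by_source.items()}
        for accession, by_source in crossref.items()
    }


def translate(mappings, source, source_id, target):
    """Translate an ID from one source to another via shared accessions.

    Returns a sorted list, which is empty when the target source has no
    entry for any accession linked to source_id.
    """
    accessions = {acc for sid, acc in mappings[source] if sid == source_id}
    return sorted({tid for tid, acc in mappings[target] if acc in accessions})


if __name__ == "__main__":
    print(build_crossref(MAPPINGS))
    print(translate(MAPPINGS, "ensembl", "ENSP_demo1", "ncbi"))
```

An ID with no counterpart in the target source comes back as an empty list rather than an error, which makes the gaps between ID sets easy to count.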

-- Don

* source: http://inparanoid.cgb.ki.se/download/current/sequences/original/   May 2005
Proteomes of model organisms contributed to InParanoid project by genome database
projects.



-------------------------------------------------------
This SF.Net email is sponsored by:
Power Architecture Resource Center: Free content, downloads, discussions,
and more. http://solutions.newsforge.com/ibmarch.tmpl
_______________________________________________
Gmod-schema mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-schema

[Gmod-schema] Re: example data for comparative genomics analysis

Aaron J. Mackey

see also: orthomcl.cbil.upenn.edu

for a public clustering of 55 predominantly-eukaryotic genomes.

-Aaron

On Oct 20, 2005, at 5:56 PM, Don Gilbert wrote:

>
> Kara,
>
> It happens I've got a set of OrthoMCL results for the 22 org-genome
> sets* from mods and ensembl that Erik Sonnhammer used for  
> Inparanoid this winter.

--
Aaron J. Mackey, Ph.D.
Project Manager, ApiDB Bioinformatics Resource Center
Penn Genomics Institute, University of Pennsylvania
email:  [hidden email]
office: 215-898-1205 (Goddard) / 215-746-7018 (PCBI)
fax:    215-746-6697
postal: Penn Genomics Institute
         Goddard Labs 212
         415 S. University Avenue
         Philadelphia, PA  19104-6017




[Gmod-schema] Re: example data for comparative genomics analysis

Kara Dolinski
Hi Aaron and Don,

Thanks for the info.  What I'm really looking for are examples of
gmod/chado databases that store these types of comparative
genomics analyses done on multiple species, just to see the
schema in action for these types of data.  Does
OrthoMCL use a gmod/chado schema backend?

Thanks,
Kara
PS.  The OrthoMCL site is really nice, and the code available via the  
web site was nice and easy to get running on our system.   :)

On Oct 20, 2005, at 6:19 PM, Aaron J. Mackey wrote:

>
> see also: orthomcl.cbil.upenn.edu
>
> for a public clustering of 55 predominantly-eukaryotic genomes.
>
> -Aaron

Re: [Gmod-schema] Re: example data for comparative genomics analysis

Kara Dolinski
In reply to this post by Aaron J. Mackey
Oops, sorry, after reading the docs on the OrthoMCL site, I see that  
it is using the GUS schema, so never mind.  :}

Re: [Gmod-schema] Re: example data for comparative genomics analysis

Aaron J. Mackey
Actually, the OrthoMCL DB site is not using GUS (where did it say that?), but
rather a custom, lightweight schema.

-Aaron

On Oct 21, 2005, at 9:22 AM, Kara Dolinski wrote:

> Oops, sorry, after reading the docs on the OrthoMCL site, I see  
> that it is using the GUS schema, so never mind.  :}

Re: [Gmod-schema] Re: example data for comparative genomics analysis

Kara Dolinski
Hi,

It's on this page:

http://orthomcl.cbil.upenn.edu/cgi-bin/OrthoMclWeb.cgi?rm=orthomcl#FAQ

Maybe I misinterpreted, and the bit about GUS is a historical note.

On Oct 21, 2005, at 9:48 AM, Aaron J. Mackey wrote:

> Actually, the OrthoMCL DB site is not (where did it say that?), but  
> rather a custom, lightweight schema.
>
> -Aaron