GSOC #2019

[Integrate Cerberus] Final Work Report @ Google Summer of Code 2019

It’s 4 o’clock on a rainy Friday morning and one more Google Summer of Code has ended. My second, to be exact. Time for me to hang up my boots and start writing another report, probably my last on this subject. There’s a lot to write.


We are in the endgame now // Week 12


Now, we test // Week 11

Week #11 31/07 to 06/08

Integration Finally works out for good! // Week 10

Week #10 24/07 to 30/07

Integration finally worked!!

What did you do this week?

Well, didn’t do much this week except almost finishing the project. Here’s an informal take on how the week went. 

To be very frank, Julio, I haven’t had my fair share of practice with comprehensions in Python, and this took a minute to figure out, as did all of test_pipelines.py and pipelines.py, which took days to get through. This isn’t complex Python; it’s good code, but there is just so much going on. I am not sure the tests I created are the best possible, because I kept going back and forth through the code, unable to figure out which function produces which output in this part of the code. You can’t just throw in logging statements and run the file or project the way we normally do. And I wanted to do it on my own at that point, because I thought a bit more effort on this last bit might make things clearer. And it did. I am happy that I did the work that was needed.

At one point on a Sunday night, I just gave up and initialized the ItemValidationPipeline(), imported everything just to see what was going on line by line. Good hunting. I am happy that it worked out (Cerberus Integration), but not happy with the tests and would like to make it better. Codewise.

What is coming up next? 

What remains: unit tests for the Cerberus-integrated part of the pipelines, documentation for the features and, last but not least, system testing.

Did you get stuck anywhere?

I am not sure this question brings me joy to answer. 

So, I say yes!! I got stuck in a lot of places around this weekend. But I am proud to say that, with the guidance of my mentors and some will of my own, the confidence to debug the lines of code written this week was never broken, and never will be. Thank you to everyone who helped!

Ramping up the integration // Week 9

Week #9 17/07 to 23/07

Well, integration isn’t working, and I am not giving up.


Only 6 Weeks left // Week 8

Week #8 10/07 to 16/07

I just realized that there aren’t many weeks left. Good times like these should never end. 

What did you do this week?

Good news, my PR for the validator has finally been merged. I am proud of it, great things coming forward → https://github.com/vipulgupta2048/spidermon/pull/2

I also worked on finishing up the Translator. We changed direction on how to write the tests for that class, and Julio’s guidance on how to think about writing better tests really helped me out. Something extremely useful I realized from using TDD in my thinking and coding: it is only while testing that I find edge cases I would never have thought about otherwise. Check this out.

*After testing*

> r"^required field$":messages.MISSING_REQUIRED_FIELD,

This pattern matches only the exact string “required field”, which is what is needed, and it works great. Here’s the catch.

*Earlier before testing I had,*

> r"required field":messages.MISSING_REQUIRED_FIELD,

This let all of these strings through as well:

- "not found required field"
- "aa required field aa"
- "required field almost anything here"

And they were all getting translated.
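The difference between the two patterns can be reproduced in a few lines. This is only a sketch: the `translate` helper and the `MISSING_REQUIRED_FIELD` string are hypothetical stand-ins for the real translator and `messages.MISSING_REQUIRED_FIELD`, and it assumes the patterns are applied with `re.search`, which is consistent with the substring matches listed above.

```python
import re

# Hypothetical stand-in for messages.MISSING_REQUIRED_FIELD.
MISSING_REQUIRED_FIELD = "Missing required field"

# Pattern before testing (unanchored) and after testing (anchored).
LOOSE = {r"required field": MISSING_REQUIRED_FIELD}
STRICT = {r"^required field$": MISSING_REQUIRED_FIELD}


def translate(message, rules):
    """Return the translation for the first pattern found in `message`, else None.

    Assumption: patterns are applied with re.search, so an unanchored
    pattern matches anywhere inside the message.
    """
    for pattern, translated in rules.items():
        if re.search(pattern, message):
            return translated
    return None


samples = [
    "not found required field",
    "aa required field aa",
    "required field almost anything here",
]

# The unanchored pattern translates every sample.
loose_results = [translate(s, LOOSE) for s in samples]

# The anchored pattern rejects them and accepts only the exact message.
strict_results = [translate(s, STRICT) for s in samples]
exact_result = translate("required field", STRICT)
```

Anchoring with `^` and `$` is what turns "contains this text" into "is exactly this text", which is the behaviour the tests forced me to pin down.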

Without testing, this would have led to all kinds of trouble and TypeErrors. I am thankful, to say the least, that testing has now become an integral part of my development work. Hence, the quote:

Good things happen when we test.

Vipul Gupta (2019-20)

What is coming up next? 

Start refactoring the item validation pipelines, since that’s the more important task at hand and is now priority one for Team Cerberus.

Here’s the big missing feature that I will also be tackling.

Schema = {'quotes': {'type': ['string', 'list'], 'schema': {'type': 'string'}}}
Data = {'quotes': [1, 'Heureka!']} 


Error found while testing

TypeError: {'quotes': [{0: ['must be of string type']}]}

This behaviour is specific to Cerberus: https://docs.python-cerberus.org/en/stable/validation-rules.html#type

Context – To add some variety to the tests, I added schemas like this one, where `type` accepts a list of allowed types for a value and the `schema` key governs the type of the elements inside it.

About the Error – I added comments there because the error we are getting is actually a parsing problem in the `Validator.py` parent class.

Usually, the errors we get and parse are in the form {field_name: message}, but here we are getting {field_name: {Array_element: message}}, which I think is causing the TypeError and is something previous developers didn’t account for, since they never saw it coming with Cerberus. Cerberus is pretty good at showing detailed errors, which is why I mentioned not adding all the messages into the translator. Still, it is good that we caught this here, as it would never have fit our use case in the future… Well, at least that’s my theory.
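One possible way to cope with the nested shape is to flatten it before parsing. The sketch below is not what the `Validator.py` parent class does; `flatten_errors` is a hypothetical helper, and the dotted-path naming is my own assumption. It only uses the error dictionary shown in the TypeError above.

```python
def flatten_errors(errors, prefix=""):
    """Flatten a nested error dict into {dotted_path: [messages]}.

    Hypothetical helper: each value is a list whose items are either
    plain message strings or dicts describing errors of child elements.
    """
    flat = {}
    for field, problems in errors.items():
        # Build a path such as "quotes" or "quotes.0".
        path = f"{prefix}.{field}" if prefix else str(field)
        for problem in problems:
            if isinstance(problem, dict):
                # Child errors (for example, per list index): recurse.
                for sub_path, msgs in flatten_errors(problem, path).items():
                    flat.setdefault(sub_path, []).extend(msgs)
            else:
                # A plain message attached directly to this field.
                flat.setdefault(path, []).append(problem)
    return flat


# The nested error reported while testing.
nested = {'quotes': [{0: ['must be of string type']}]}
flat = flatten_errors(nested)

# A flat error dict passes through unchanged.
simple = flatten_errors({'name': ['required field']})
```

With this, every message ends up under a single string key again, so code that expects {field_name: message} pairs has something it can iterate over without hitting a dict where it expected a string.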

Did you get stuck anywhere?

Yep, and I have been communicating a lot more with my nimble questions. I feel much better about asking, answering and discussing problems. Glad to have figured that one out from my 1st eval. Quite happy with the work that’s happening as well.

 

Sticks and stones may break my tests … // Week 7

Week #7 03/07 to 09/07


FIRST EVALUATION CLEARED // Week 6

Week #6 26/06 to 02/07

Well, I survived the first evaluation as you can all see. Made some mistakes along the way, recovered with the advice from my mentors and hopefully going strong into work period 2. Let’s talk shop, yes. 


Milestone One finished // Week 5


Week #5 19/06 to 25/06

The first evaluation is here, got done with a milestone and took a small break for a personal event.


They see me coding, they testin’ // Week 4

Week #4 12/06 to 18/06

Well, this has been another rather testing week. 
