1. 12 Dec, 2022 1 commit
    • Consolidate PRs into single branch (#219) · 65378d4d
      * Support xor_value in returned strings.
      
      Extend the tuple that represents an instance of a match to include the xor key.
      This breaks all existing scripts that are unpacking the tuple, which I'm not
      very happy with.
      
      This also updates the submodule to use the latest master so that I can get the
      new xor key values.
      
      Also, adds a fix to get yara building here by defining BUCKETS_128 and
      CHECKSUM_1B as needed by the new tlsh stuff (discussed with @metthal).
      
      * Add two new objects to yara-python.
      
      Add a StringMatch object, which represents a matched string. It has an
      identifier member (this is the string identifier, eg: $a) and an instances
      member which contains a list of matched string instances.
      
      It also keeps track of the string flags internally but does not expose them
      directly as the string flags contain things that are internal to YARA (eg:
      STRING_FLAGS_FITS_IN_ATOM). The reason it keeps track of the string modifiers
      is so that it can be extended to allow users to take action based upon certain
      flags. For example, there is an "is_xor()" member on StringMatch which will
      return True if the string is using the xor modifier. This way users can call
      another method (discussed below) to get the plaintext string back.
      
      Add a StringMatchInstance object which represents an instance of a matched
      string. It contains the offset, matched data and the xor key used to match the
      string (this is ALWAYS set, even to 0 if the string is not an xor string).
      
      There is a "plaintext()" method on the StringMatchInstance objects which will
      return a new bytes object with the xor key applied. This allows users to do
      something like this:
      
      ```
      print(instance.plaintext() if string.is_xor() else instance.matched_data)
      ```
      
      Technically, the plaintext() method will return the matched_data if the xor_key
      is 0 so they don't need to do the conditional but this allows them a nice way to
      know if the xor_key is worth recording along with the plaintext.
      
      I decided not to implement richcompare for these new objects as it isn't
      entirely clear what I would want to do the comparison on.
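
The relationship between the matched data, the xor key, and `plaintext()` described above can be modelled in pure Python. This is a minimal sketch, not yara-python's implementation: the helper name `xor_plaintext` is hypothetical, and it assumes a single-byte key applied to every byte of the matched data.

```python
def xor_plaintext(matched_data: bytes, xor_key: int) -> bytes:
    """Return matched_data with a single-byte xor key applied (hypothetical helper)."""
    if xor_key == 0:
        # A key of 0 leaves every byte unchanged, so the data is returned as-is.
        return matched_data
    return bytes(b ^ xor_key for b in matched_data)


# "hello" xored with 0x20, standing in for data found in a scanned file.
encoded = bytes(b ^ 0x20 for b in b"hello")
decoded = xor_plaintext(encoded, 0x20)
```

Because xor is its own inverse, applying the same key twice recovers the original bytes, which is why one method is enough to go from matched data back to plaintext.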
      
      * Add "matched_length" member.
      
      Add a "matched_length" member to match instances. This is useful when the
      "matched_data" member is a subset of the actually matched data.
      
      Add a test for this that sets the max_match_data config to 2 and then checks to
      make sure the "matched_length" and "matched_data" members are correct.
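
A caller could compare the two members to detect truncated match data. This is a minimal sketch that assumes only the `matched_length` and `matched_data` members named above; `is_truncated` is a hypothetical helper, and the `SimpleNamespace` object merely stands in for a real match instance.

```python
from types import SimpleNamespace


def is_truncated(instance) -> bool:
    """True when matched_data holds fewer bytes than were actually matched."""
    return instance.matched_length > len(instance.matched_data)


# Stand-in for a match instance produced with max_match_data set to 2:
# five bytes matched, but only the first two were kept.
fake_instance = SimpleNamespace(matched_data=b"ab", matched_length=5)
truncated = is_truncated(fake_instance)
```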
      
      * Add modules list to yara object.
      
      Add support for getting the list of available modules. It is available just by
      accessing the yara.modules attribute, which contains a list of available
      modules.
      
      >>> print('\n'.join(yara.modules))
      tests
      pe
      elf
      math
      time
      console
      >>>
      
      Note: This commit also brings in the necessary defines to build the authenticode
      parser, which is also done in the xor_value branch. Also, this commit updates
      the yara submodule which will likely overwrite the changes done in the xor_value
      so I recommend updating the submodule after both are merged.
      
      * Update yara to 65feab41d4cbf4a75338561d8506fc1fa9fa6ba6.
      
      * Fix test using \t in a regex.
      
      * Fix build on Windows in appveyor.
      
      * Actually fix appveyor builds on windows?
      Wesley Shields authored
  2. 24 Oct, 2022 1 commit
  3. 20 May, 2022 1 commit
    • Allow metadata to contain a list of values (#201) · d29ca083
      The `Rules.match` function now receives an optional `allow_duplicate_metadata=True` argument, which changes the structure of `Match.meta`. By default `Match.meta` is a dictionary with metadata names and their corresponding values; if a metadata name appears duplicated in a rule, the last value will be used. For example, consider the following rule:
      
      ```yara
      rule demo {
         meta: 
           foo = "foo #1"
           foo = "foo #2"
           bar = "bar"
         condition:
            false
      }
      ```
      
      In that case `Match.meta` would be `{"foo": "foo #2", "bar": "bar"}` by default (`allow_duplicate_metadata=False`), but with `allow_duplicate_metadata=True` it would be: `{"foo": ["foo #1", "foo #2"], "bar": ["bar"]}`. 
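
The two shapes of `Match.meta` can be modelled in pure Python. This is a sketch of the described behavior only, not the code yara-python uses internally; `build_meta` and the list of pairs are hypothetical names introduced for illustration.

```python
def build_meta(pairs, allow_duplicate_metadata=False):
    """Model how (name, value) metadata pairs could be folded into a dict."""
    meta = {}
    for name, value in pairs:
        if allow_duplicate_metadata:
            # Every name maps to a list, even when it appears only once.
            meta.setdefault(name, []).append(value)
        else:
            # Later definitions overwrite earlier ones.
            meta[name] = value
    return meta


# Metadata of the `demo` rule above, in declaration order.
demo_pairs = [("foo", "foo #1"), ("foo", "foo #2"), ("bar", "bar")]
default_meta = build_meta(demo_pairs)
list_meta = build_meta(demo_pairs, allow_duplicate_metadata=True)
```

Note that with the flag enabled every value becomes a list, so code reading `Match.meta` has to handle one shape or the other consistently.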
      cccs-rs authored
  4. 18 May, 2022 1 commit
    • Add a "warnings" member to Rules. (#208) · e14f096e
      When compiling rules that have warnings, currently the only way to know they have
      warnings is to specify error_on_warning=True to yara.compile(). This will throw
      an exception that you can then check the warnings member of, like this:
      
      ```
      r = 'rule a { strings: $a = "a" condition: $a } rule b { strings: $b = "b" condition: $b }'
      
      try:
          rules = yara.compile(source=r, error_on_warning=True)
      except yara.WarningError as e:
          print(e.warnings)
      ```
      
      This stops the compilation process, so if you're trying to just know if there
      are warnings but still run the rules there is no good way to do it without using
      the exception mechanism and then compiling the rules a second time (with
      error_on_warning not set).
      
      This patch adds a warnings member to the compiled Rules object, which is always
      set to a list of warning strings. If you want to error on warning you can still
      use error_on_warning=True in yara.compile() and get the normal behavior, but if
      you just want to compile and know if there are warnings you can now use this new
      member without having to compile a second time.
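
With the new member, reporting warnings no longer needs the exception path. A minimal sketch, assuming only that the compiled object exposes `warnings` as a list of strings as described above; `summarize_warnings` is a hypothetical helper, and the `SimpleNamespace` object with its placeholder message stands in for a real `Rules` object.

```python
from types import SimpleNamespace


def summarize_warnings(rules):
    """Return (count, lines) for the warnings attached to a compiled Rules object."""
    lines = [f"warning: {w}" for w in rules.warnings]
    return len(lines), lines


# Stand-in for a compiled Rules object. The message is placeholder text,
# not real YARA output.
fake_rules = SimpleNamespace(warnings=["placeholder warning for rule a"])
count, lines = summarize_warnings(fake_rules)
```

With yara-python installed, the same helper would be called on the result of `yara.compile(source=r)` with `error_on_warning` left unset, and the rules would remain usable for matching afterwards.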
      
      Suggested by: Tom Lancaster
      Fixes: #207
      Wesley Shields authored
  5. 04 Mar, 2022 2 commits
  6. 05 Oct, 2021 1 commit
  7. 16 Mar, 2021 1 commit
  8. 23 Feb, 2021 1 commit
    • Add support for exposing runtime warnings via a callback. (#160) · d20e6f16
      Add support for a "warnings_callback" argument to Rules.match(). If provided the
      function definition needs to be:
      
      def warnings_callback(warning_type, message)
      
      The callback will be called with a warning type of yara.WARNING_TOO_MANY_MATCHES
      and the message will be a string indicating which rule caused the warning. I
      think a warning type and a message is reasonably flexible in case we introduce
      other runtime warnings in the future.
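
A callback with this signature could collect warnings for later inspection. This is a minimal sketch: `make_warning_collector` is a hypothetical helper, the warning type passed in the simulated call is a placeholder integer rather than the real value of `yara.WARNING_TOO_MANY_MATCHES`, and any return-value handling is left out because the commit message does not describe it.

```python
def make_warning_collector():
    """Return (callback, collected); the callback takes (warning_type, message)."""
    collected = []

    def warnings_callback(warning_type, message):
        # Record the warning instead of letting it go to stderr.
        collected.append((warning_type, message))

    return warnings_callback, collected


callback, collected = make_warning_collector()
# Simulate one invocation; 1 is a placeholder warning type.
callback(1, "example_rule")
```

With yara-python installed, the callback would be handed over as `rules.match(data=..., warnings_callback=callback)`.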
      
      If a callback is not provided we print a warning on stderr using the normal
      python warning system. It's worth noting the function I'm using was introduced
      in python 3.2. I can switch it to something more portable if you don't want to
      pull support for 2.x yet.
      
      While I'm here, also chase the renaming of rules_list_head and other list
      variables so that it can compile with latest yara master.
      Wesley Shields authored
  9. 02 Sep, 2020 1 commit
    • Allow a Py_buffer as data for Rules_match (#152) · fa3795f9
      * Allow a Py_buffer as data for Rules_match
      
      This makes rules matching compatible with data objects that
      `PyArg_ParseTuple` does not consider read-only (even though they might
      actually be), such as memoryviews. The main change is replacing the `s#`
      formatter with `s*` and replacing the `(pointer, length)` pair with a
      `Py_buffer` object accordingly. Additional care must be taken to release
      the `Py_buffer` on every error path.
      
      * Rules_match: zero-initialize data
      
      PyArg_ParseTupleAndKeywords does not initialize optional fields unless
      they are passed, which means we need to zero-initialize the data buffer
      to be sure the later NULL checks always work.
      
      This commit also gets rid of the unneeded has_data flag.
      
      * Add test for matching on a memoryview
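
From the Python side, the practical effect is that a slice of a larger buffer can be scanned without copying it into a new bytes object. A minimal sketch: `scan_slice` is a hypothetical helper, and `FakeRules` is a stand-in that only records what it receives, so that the example runs without yara-python.

```python
class FakeRules:
    """Stand-in for a compiled Rules object; reports the data it was given."""

    def match(self, data=None):
        # bytes() works on any object supporting the buffer protocol.
        return type(data).__name__, bytes(data)


def scan_slice(rules, buffer, start, stop):
    """Scan a slice of `buffer` without copying it first."""
    view = memoryview(buffer)[start:stop]
    return rules.match(data=view)


payload = bytearray(b"headerPAYLOADtrailer")
seen_type, seen_bytes = scan_slice(FakeRules(), payload, 6, 13)
```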
      Jan Teske authored
  10. 12 Jun, 2020 1 commit
    • Fix issue #149. · cfd49c04
      This is a regression introduced in #140. When a string in the metadata section contains invalid UTF-8 characters, the behavior in Python 2 is to leave the string exactly as it appears in YARA. In Python 3, however, the invalid characters are removed, because Python 3 strings are not handled as bytes like in Python 2; they must have a valid encoding. PR #140 was an attempt to homogenize the behavior in both versions of Python, but it introduced this other issue.
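
The difference between the two behaviors can be illustrated with plain Python 3 byte handling. This is only a sketch of the outcomes described above, using a made-up metadata value; it does not reproduce yara-python's C code.

```python
raw = b"foo\xffbar"  # 0xff can never appear in valid UTF-8

# Dropping invalid bytes, matching the Python 3 behavior described above.
stripped = raw.decode("utf-8", errors="ignore")

# Keeping the bytes untouched, matching the Python 2 behavior described above.
untouched = raw
```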
      Victor M. Alvarez authored
  11. 23 Apr, 2020 3 commits
    • Support a "is_global" and "is_private" member on Rules. (#130) · 5224381e
      * Support a "is_global" and "is_private" member on Rules.
      
      When writing linters it is currently impossible to know (via rule introspection)
      if a given rule is private or global. We have banned global rules for our use
      case and we have to resort to a janky regex against our rules files to know if
      anyone is about to commit a global rule. I figure exposing these two flags via
      python will be useful for programmatically checking those bits.
      
      I'm not very pleased with the name "is_global" - I wanted to go with just
      "global" and "private" but "global" is a reserved keyword and rule.global breaks
      the python interpreter. I'm open to changing the member names if you have any
      suggestions.
      
      * Decrement reference counts on global and private.
      
      * Update global and private checks after API changes.
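
The linter use case mentioned above could look roughly like this. It is a minimal sketch: `find_global_rules` is a hypothetical helper, the `SimpleNamespace` objects stand in for rule objects, and it assumes that iterating a compiled `Rules` object yields rules exposing `identifier`, `is_global` and `is_private`.

```python
from types import SimpleNamespace


def find_global_rules(rules):
    """Return the identifiers of all rules flagged as global."""
    return [rule.identifier for rule in rules if rule.is_global]


# Stand-ins for rule objects, so the example runs without yara-python.
fake_rules = [
    SimpleNamespace(identifier="ok_rule", is_global=False, is_private=False),
    SimpleNamespace(identifier="bad_rule", is_global=True, is_private=False),
    SimpleNamespace(identifier="helper", is_global=False, is_private=True),
]
banned = find_global_rules(fake_rules)
```

A pre-commit check could then fail whenever the returned list is non-empty, replacing the regex over the rule files.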
      Wesley Shields authored
    • Handle invalid unicode in metadata values. (#136) · bc4e0cdb
      * Handle invalid unicode in metadata values.
      
      In #135 it was brought up that you can crash the python interpreter if you have
      invalid unicode in a metadata value. This is my attempt to fix that by
      attempting to create a string, and if that fails falling back to a bytes object.
      On the weird chance that the bytes object fails to create I added a safety check
      so that we don't add a NULL ptr to the dictionary (this is how the crash was
      manifesting).
      
      It's debatable if we want to ONLY add strings as metadata, and NOT fall back to
      bytes. If we don't fall back to bytes the only other option I see is to silently
      drop that metadata on the floor. The tradeoff here is that now you may end up
      with a string or a bytes object in your metadata dictionary, which is less than
      ideal IMO.
      
      I'm open to suggestions on this one.
      
      Fixes #135
      
      * Add error handling to conversion to Unicode
      Metadata test accepts stripped or original characters
      
      * Remove 'or' clause from tests and add another NULL test check.
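
The string-then-bytes fallback described in the first part of this commit can be modelled in pure Python. This is a sketch of the idea only, since the real change lives in the C extension; `metadata_value` is a hypothetical name.

```python
def metadata_value(raw: bytes):
    """Return a str when raw is valid UTF-8, otherwise the original bytes."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Fall back to bytes rather than dropping the value.
        return raw


valid = metadata_value(b"caf\xc3\xa9")
invalid = metadata_value(b"bad\xffvalue")
```

This makes the tradeoff noted above concrete: callers may receive either a `str` or a `bytes` object and need to check which one they got.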
      
      Co-authored-by: malvidin <malvidin@gmail.com>
      Wesley Shields authored
  12. 21 Apr, 2020 1 commit
  13. 05 Dec, 2018 2 commits
  14. 03 Aug, 2018 1 commit
  15. 01 Aug, 2018 1 commit
  16. 30 Jul, 2018 1 commit
  17. 31 Oct, 2017 1 commit
    • Callback on include (#67) · 01bc8977
      * stable on python 2
      
      * Stable on python 2 and 3 (fixed utf-8 and ascii encoding issues)
       * Still needs compatible yara submodule update once pull request accepted
      
      * * Fixed all encoding issues
      
      * Proper error handling
      
      * Updating yara submodule to reference yara patched with include callback support
      
      * Updating submodule's branch
      
      * Updating yara submodule
      
      * Updating yara submodule
      
      * Updating yara submodule
      
      * Updating yara submodule
      
      * Submodule update
      
      * * Fixing memory leaks
      * Fixing errors handling
      
      * making error messages order consistent between include_callback and default yara behaviour
      
      * Removing exception printing when callback fails
      
      * Minor re-styling.
      
      * Destroy compiler if PyCallable_Check(include_callback) fails.
      
      * References to Py_None should be increased.
      
      * Use Py_DECREF instead of Py_XDECREF for references that can't be NULL.
      
      * Minor re-styling.
      
      * Fix reference leak.
      
      After calling Py_INCREF(include_callback) some code paths were leading to a return without calling Py_DECREF. Calling Py_INCREF before yr_compiler_set_include_callback is not necessary, as this function doesn't yield control to Python, but it should be called before yr_compiler_add_XX.
      
      * Remove unnecessary calls to Py_INCREF/Py_DECREF.
      
      The references were already incremented in yara_compile.
      
      * Implement test case for include callbacks
      
      * Point yara submodule to official repository.
      Victor M. Alvarez authored
  18. 25 Oct, 2017 1 commit
  19. 07 Oct, 2017 2 commits
  20. 29 Aug, 2017 1 commit
  21. 28 Aug, 2017 1 commit
  22. 16 May, 2017 1 commit
  23. 15 Jun, 2016 1 commit
  24. 31 Jan, 2016 1 commit
  25. 11 Nov, 2015 1 commit
  26. 23 Oct, 2015 1 commit
  27. 11 Sep, 2015 1 commit